A lot of things have gone wrong with AI for me, but these two pushed me over the edge. This is mainly about LLMs, but other AI has not been particularly helpful for me either.
Case 1
I was trying to find the music video from where a screenshot was taken.
I gave o4-mini the image and asked where it was from. It refused, saying that it does not discuss private details. Fair enough. I told it that it is xyz artist. It then listed three of their popular music videos, none of which was the correct answer to my question.
Then I started a new chat and described in detail what the screenshot was. It once again regurgitated similar things.
I gave up. I did a simple reverse image search and found the answer in 30 seconds.
Case 2
I wanted a way to create a spreadsheet for tracking investments which had xyz columns.
It did give me the correct columns and rows, but the formulae for the calculations were off. They were almost correct most of the time, but almost correct is useless when working with money.
I gave up. I manually made the spreadsheet with all the required details.
Why are LLMs so wrong most of the time? Aren’t they processing high quality data from multiple sources? I just don’t understand the point of even making this software if all it can do is sound smart while being wrong.
¯\_(ツ)_/¯ would I get downvoted for saying skill issue lol?
I have recently used LLMs for troubleshooting/assistance with exposing some self-hosted services publicly through a VPS I recently got. I'm not a novice, but I'm no pro either when it comes to the Linux terminal.
Anyway, long story short, in that instance the tool (LLM) was extremely helpful in not only helping me correctly implement what I wanted but also explaining/teaching me as I went. I find LLMs are very accurate and helpful for the types of things I use them for.
But to answer your question on why LLMs can be wrong: it's because they are guessing machines that just pick the next best word. They aren't smart at all. They aren't "AI", they are large language models.
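If it helps to see the "pick the next best word" idea, here is a rough toy sketch in Python. It is not how any real model is implemented: the word list and the probabilities are made up for illustration, and the names `NEXT_WORD_PROBS`, `pick_greedy`, and `pick_sampled` are hypothetical. The point is only that the picker goes by likelihood, not by truth.

```python
import random

# Made-up probabilities for the word following "The capital of Australia is".
# These numbers are an assumption for illustration only; a real model scores
# a vocabulary of many thousands of tokens.
NEXT_WORD_PROBS = {
    "Sydney": 0.55,     # shows up a lot in text, but is the wrong answer
    "Canberra": 0.40,   # the correct answer
    "Melbourne": 0.05,
}

def pick_greedy(probs):
    """Return the single most likely next word."""
    return max(probs, key=probs.get)

def pick_sampled(probs, rng):
    """Draw the next word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
greedy = pick_greedy(NEXT_WORD_PROBS)
samples = [pick_sampled(NEXT_WORD_PROBS, rng) for _ in range(10)]

print("greedy pick:", greedy)
print("sampled picks:", samples)
```

With these made-up numbers the greedy pick is the likely-sounding word rather than the correct one, and the sampled picks vary from draw to draw, which is roughly why the same question can get a confident wrong answer or different answers on different tries.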
It spouts out generic and outdated answers when asked specific questions, which I can identify as wrong (skill issue, lol).
If you are super confident with using them, maybe you are really not knowledgeable enough about those things. Skill issue, I guess.