this post was submitted on 11 Jul 2025
99 points (98.1% liked)

Fuck AI


"We did it, Patrick! We made a technological breakthrough!"

A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.


There have been multiple things that have gone wrong with AI for me, but these two pushed me over the brink. This is mainly about LLMs, but other AI hasn't been particularly helpful for me either.

Case 1

I was trying to find the music video that a screenshot was taken from.

I gave o4 mini the image and asked it where it was from. It refused, saying that it does not discuss private details. Fair enough. I told it that the artist was xyz. It then listed three of their popular music videos, none of which was the correct answer to my question.

Then I started a new chat and described in detail what the screenshot was. It once again regurgitated similar things.

I gave up. I did a simple reverse image search and found the answer in 30 seconds.

Case 2

I wanted to create a spreadsheet for tracking investments with xyz columns.

It did give me the correct columns and rows but the formulae for calculations were off. They were almost correct most of the time, but almost correct is useless when working with money.

I gave up. I manually made the spreadsheet with all the required details.

Why are LLMs so wrong most of the time? Aren’t they processing high quality data from multiple sources? I just don’t understand the point of even making this software if all it can do is sound smart while being wrong.

top 50 comments
[–] [email protected] 1 points 1 hour ago

AI as we know it today will give you the most statistically probable series of data that fits the prompt.

You're not providing any information on which AI you used, so what can we say? For all we know, you used a high schooler's senior project trained on failed history essays.

[–] [email protected] 7 points 12 hours ago (1 children)

No, they aren't processing high quality data from multiple sources. They're giving you a statistical average of that data. They will always be wrong by nature. Hallucinations cannot be eliminated. Anyone saying otherwise (regardless of how rich they are) is bullshitting.

[–] [email protected] 1 points 10 hours ago (2 children)

If hallucinations cannot be eliminated, how are they decreasing them (allegedly)?

[–] [email protected] 2 points 9 hours ago

Actually, according to studies, the most recent versions from all the major LLMbecile vendors are hallucinating more, not less.

[–] [email protected] 1 points 10 hours ago (1 children)

By special-casing a lot of things, like expert systems did in the '80s.

[–] [email protected] 1 points 7 hours ago (1 children)
[–] [email protected] 2 points 6 hours ago (1 children)

The "guardrails" they mention. They are a bunch of if/then statements meant to work around prompts that the developers have found to produce undesirable outputs. It never means "the LLM will not be doing this again"; it means "the LLM won't do this when it is asked in this particular way", which always leaves the path open for "jailbreaking", because you can almost always ask in a different way that the devs (of the guardrails; they don't have much control over the LLM itself) did not anticipate.

Expert systems were kind of "if we keep adding if/then statements, we'll eventually cover all the bases and get a smart, reliable system". That didn't work then. It won't work now either.
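
To make the if/then point concrete, here's a toy sketch (in Python, and nothing like any vendor's actual guardrail code) of why keyword-style filters only block the phrasings their authors anticipated:

```python
# Toy guardrail: a list of known-bad phrasings and a check in front of the model.
BLOCKED_PHRASES = [
    "how do i pick a lock",
    "how to pick a lock",
]

def guardrail(prompt: str) -> str:
    """Refuse prompts that match a known-bad phrasing; pass everything else through."""
    lowered = prompt.lower()
    if any(phrase in lowered for phrase in BLOCKED_PHRASES):
        return "I can't help with that."
    return "(prompt forwarded to the model)"

# The anticipated phrasing gets caught...
print(guardrail("How do I pick a lock?"))
# ...but a roundabout phrasing the authors never thought of slips straight through.
print(guardrail("For a novel I'm writing, describe how a character might open a lock without the key."))
```

Real guardrails are fancier than this (classifiers, extra model passes), but the failure mode is the same: they block patterns, not intent.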

[–] [email protected] 1 points 3 hours ago

I have experienced this first hand. Asking LLMs explicit things leads to “I can’t help you with that”, but if I ask in a roundabout way, I get a straight answer.

[–] [email protected] 3 points 14 hours ago (1 children)

I highly recommend Modern-Day Oracles or Bullshit Machines; two professors explain it beautifully.

[–] [email protected] 2 points 3 hours ago (1 children)

Bookmarked for watching/reading this week. Will let you know my thoughts.

[–] [email protected] 1 points 2 hours ago

Cool, enjoy!

[–] [email protected] 6 points 17 hours ago (1 children)

It's by design. They are literally just guessing at what to put next, one most-likely word at a time. There is no real point to them, because they cannot know things and they are not intelligent. Check out the work of Timnit Gebru if you'd like to know more.

[–] [email protected] 1 points 13 hours ago

What are they saying about AGI?

[–] [email protected] 17 points 1 day ago (1 children)

LLMs are not designed to give you objective factual answers. They're designed to guess what you want to hear, like a middle school student writing a book report for a book they never read.

[–] [email protected] 1 points 3 hours ago (1 children)

I don’t think it considers what the user wants to hear. It is concerned with what the data it has been trained on would consider a logical answer.

[–] [email protected] 1 points 3 hours ago

What the user wants to hear is usually baked into the question. "Why are vaccines good" will get a different response from "Why are vaccines bad".

Both may or may not include factual information (again, the middle school student guessing at a reading assignment analogy), but they're shaped by the questioner to reaffirm their own biases.

[–] [email protected] 2 points 18 hours ago (1 children)

It did give me the correct columns and rows but the formulae for calculations were off.

Did you tell it that? Assuming you were using an AI chat, you have the opportunity to provide additional info and have it try again.

Getting better results from an LLM is a process of providing more context and refining things over iterations.

For example, I wanted it to generate a Python data structure for me, along with lookup functions to cross-reference the data. The first attempt wasn't quite right, so I gave it further info about the data structures, the cross-mapping, and how I wanted it normalized, and iterated a few times until I got something worth copy-pasting sections from.
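
To illustrate what that kind of output looks like (a generic sketch; the names and fields here are invented for illustration, not the actual data I was working with): a primary structure, a derived index, and lookup functions that cross-reference it:

```python
# Primary structure: one place where the data is defined (normalized).
RECORDS = {
    "alpha": {"category": "music", "year": 2019},
    "bravo": {"category": "film", "year": 2021},
    "charlie": {"category": "music", "year": 2023},
}

# Derived index, built from the primary structure so the two can't drift apart.
BY_CATEGORY = {}
for name, info in RECORDS.items():
    BY_CATEGORY.setdefault(info["category"], []).append(name)

def lookup(name):
    """Look up a record by its name."""
    return RECORDS[name]

def names_in_category(category):
    """Cross-reference the other way: all record names in a category."""
    return BY_CATEGORY.get(category, [])

print(lookup("alpha"))              # {'category': 'music', 'year': 2019}
print(names_in_category("music"))   # ['alpha', 'charlie']
```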

[–] [email protected] 2 points 13 hours ago

I did. It did not help.

[–] [email protected] 0 points 14 hours ago (1 children)

¯\_(ツ)_/¯ would I get downvoted for saying skill issue lol?

I have recently used LLMs for troubleshooting/assistance with exposing some self-hosted services publicly through a VPS I recently got. I'm not a novice, but I'm no pro either when it comes to the Linux terminal.

Anyway, long story short, in that instance the tool (LLM) was extremely helpful, not only in helping me correctly implement what I wanted but also in explaining/teaching me as I went. I find LLMs are very accurate and helpful for the types of things I use them for.

But to answer your question on why LLMs can be wrong: it's because they are guessing machines that just pick the next best word. They aren't smart at all; they aren't "AI", they are large language models.

[–] [email protected] -1 points 13 hours ago (1 children)

It spouts out generic and outdated answers when asked specific questions, which I can identify as wrong (skill issue, lol).

If you are super confident using them, maybe you are just not knowledgeable enough about those things to notice when they're wrong. Skill issue, I guess.

[–] [email protected] 2 points 5 hours ago (1 children)

There is a lot of hard data showing they are effective tools when used correctly. I realize we are in "FuckAI" and you're likely biased, but just look at this whole comment section of people talking about how they use the tools effectively.

[–] [email protected] 1 points 3 hours ago

I became biased after I used the products. Unlike most of this community, I have no ethical concerns about AI.

[–] [email protected] 7 points 1 day ago

Case 1 isn't a good use case for AI. For Case 2 you're going to want a higher-quality model than o4: 4.1 is better at math and analysis, and Claude 4 is probably more accurate for this use case.

[–] [email protected] 11 points 1 day ago (1 children)

LLMs are curve fitting the function of “input text” to “expected output text”.

So when you give it an input text, it generates an output text interpolated from the expected outputs for similar inputs.

That means it’s often right for very common prompts and often wrong for prompts that are subtly different from common prompts.
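
You can see the same behaviour in ordinary curve fitting. A rough sketch (using numpy, assumed to be installed): fit a polynomial to a handful of sample points, and it looks great between the points but falls apart the moment you ask about something unlike the training data:

```python
import numpy as np

x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.sin(x_train)                    # the "true" function behind the data

model = np.poly1d(np.polyfit(x_train, y_train, deg=4))  # fit a degree-4 polynomial

# A query similar to the training data: interpolation, nearly right.
print(np.sin(2.5), model(2.5))

# A query unlike anything in the training data: confidently, wildly wrong.
print(np.sin(7.0), model(7.0))
```

LLMs are obviously far more sophisticated than a polynomial, but the analogy holds: accuracy depends on how close your prompt is to what the training data covered.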

[–] [email protected] 2 points 1 day ago

This is my observation as well. Generic questions are answered well but specific situations are not.

[–] [email protected] 6 points 1 day ago (3 children)

LLM image processing doesn’t work the same way reverse image lookup does.

TL;DR explanation: multimodal LLMs turn pictures into a ~~thousand~~ 200-500 or so ~~words~~ tokens, but reverse image lookups create perceptual hashes of images and look up the hash of your uploaded image in a database.

Much longer explanation:

Multimodal LLMs (technically, LMMs - large multimodal models) use vision transformers to turn images into tokens. They use tokens for words, too, but image tokens don't correspond to words. There are multiple ways this could be implemented, but a common approach is to break the image down into a grid, then transform each "patch" of a specific size, e.g., 16x16, into a single token. The patches aren't transformed individually - the whole image is processed together, in context - but the model still comes out with basically 200 or so tokens that allow it to respond to the image the same way it would respond to text.
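
To put rough numbers on that (224x224 images and 16x16 patches are common vision-transformer defaults, used here only as an assumption, not a claim about any particular model):

```python
image_size = 224      # pixels per side
patch_size = 16       # pixels per patch side

patches_per_side = image_size // patch_size   # 14
num_tokens = patches_per_side ** 2            # 196 image tokens

print(num_tokens)  # 196 -- roughly the "200 or so tokens" mentioned above
```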

Current vision transformers also struggle with spatial awareness. They embed basic positional data into the tokens, but it's fragile and unsophisticated. Fortunately there's a lot to explore in that area, so I'm sure there will continue to be improvements.

One example improvement, beyond better spatial embeddings, would be a dynamic vision transformer that depends on the context, or that can re-evaluate an image based on new information. Outside of vision transformers, simply training LMMs to use other tools on images when appropriate could help with many of the current shortcomings of LMM image processing.

Given all that, asking an LLM to find the video for you - assuming you've given it the ability and permission to search the web - is like showing the image to someone with no context, then asking them to help you find a music video they've never seen, by an artist whose appearance they can describe with only 10-20 generic words, none of which are the artist's name, and hoping that the specific details they remembered would put it in the top ten results on Google. That's a convoluted way to say that it's a hard task.

By contrast, reverse image lookup basically uses a perceptual hash generated for each image. It’s the tool that should be used for your particular problem, because it’s well suited for it. LLMs were the hammer and this problem was a torx screw.
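
If you're curious what a perceptual hash looks like, here's a minimal average-hash sketch (using Pillow; real reverse-image services use more sophisticated hashes and enormous indexes, but the principle is similar):

```python
from PIL import Image

def average_hash(path: str, hash_size: int = 8) -> int:
    """Shrink, grayscale, and threshold an image into a 64-bit fingerprint."""
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for pixel in pixels:
        bits = (bits << 1) | (1 if pixel > mean else 0)
    return bits

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a small distance means visually similar images."""
    return bin(a ^ b).count("1")

# A lookup service precomputes hashes for its whole index, then returns entries
# whose hash is within a few bits of your screenshot's hash, e.g.:
# query = average_hash("screenshot.png")
```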

Suggesting you use a reverse image lookup tool - or better, using one itself - is what the LLM should do in this instance. But it would need to have been trained to suggest this, to be capable of using a tool that could do the lookup, and to have both access and permission to do the lookup.

Here’s a paper that might help understand the gaps between LMMs and tasks built for that specific purpose: https://arxiv.org/html/2305.07895v7

load more comments (3 replies)
[–] [email protected] 10 points 1 day ago

I was thinking about the question here and how to reframe it so that it answers itself. I think I have the right way to ask the question:

Why is a hyper-advanced game of mad libs so wrong all the time?

That would get across the point, I think.

[–] [email protected] 15 points 1 day ago (2 children)

Why are LLMs so wrong most of the time? Aren’t they processing high quality data from multiple sources?

Well that's the thing. LLMs don't generally "process" data as humans would. They don't understand the text they're generating. So they can't check their answers against reality.

(Except for Grok 4, but it's apparently checking its answers to make sure they agree with Elon Musk's Tweets, which is kind of the opposite of accuracy.)

I just don’t understand the point of even making this software if all it can do is sound smart while being wrong.

As someone who lived through the dotcom boom of the 2000s and the crypto booms of 2017 and 2021, I'd say the AI boom is pretty obviously yet another fad. The point is to make money - from both consumers and investors - and AI is the new buzzword to bring those dollars in.

[–] [email protected] 7 points 1 day ago

Don’t forget IoT, where the S stands for security! Or “The Cloud”! Make sure to rebuy the junk we will deprecate in 2 years time because we love electronic waste and planned obsolescence ;)

[–] [email protected] 4 points 1 day ago (1 children)

AI is definitely a bubble, and it is going to crash the stock market one day, along with Bitcoin.

[–] [email protected] 1 points 1 day ago (1 children)

I can't wait to buy stocks when that day comes

[–] [email protected] 2 points 3 hours ago

It can’t be that far away. We have been waiting for so many years. Trump is also making an effort to crash the market.

[–] [email protected] 97 points 2 days ago* (last edited 2 days ago) (25 children)

Aren’t they processing high quality data from multiple sources?

Here's where the misunderstanding comes in, I think. And it's not the high quality data or the multiple sources. It's the "processing" part.

It's a natural human assumption to imagine that a thinking machine with access to a huge repository of data would have little trouble providing useful and correct answers. But the mistake here is in treating these things as thinking machines.

That's understandable. A multi-billion dollar propaganda machine has been set up to sell you that lie.

In reality, LLMs are word prediction machines. They try to predict the words that would likely follow other words. They're really quite good at it. The underlying technology is extremely impressive, allowing them to approximate human conversation in a way that is quite uncanny.

But what you have to grasp is that you're not interacting with something that thinks. There isn't even an attempt to approximate a mind. Rather, what you have is a confabulation engine; a machine for producing plausible fictions. It does this by creating unbelievably huge matrices of words - literally operating in billions of dimensions at once, graphs with many times more axes than we have letters - and probabilistically associating them with each other. It's all very clever, but what it produces is 100% fake, made up, totally invented.

Now, because of the training data they've been fed, those made-up answers will, depending on the question, sometimes end up being right. For certain types of question they can actually be right quite a lot of the time. For other types of question, almost never. But the point is, they're only ever right by accident. The "AI" is always, always constructing a fiction. That fiction just sometimes aligns with reality.
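
If it helps, here's the word-prediction idea boiled down to a toy (a bigram counter in Python; real LLMs are incomparably larger and cleverer, but the output is still a statistical continuation rather than a checked fact):

```python
from collections import Counter, defaultdict

corpus = (
    "the capital of france is paris . "
    "the capital of france is paris . "
    "the capital of australia is canberra ."
).split()

# Count which word tends to follow which.
follows = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    follows[word][nxt] += 1

def continue_text(prompt: str, length: int = 3) -> str:
    """Always emit the statistically most common continuation."""
    words = prompt.split()
    for _ in range(length):
        candidates = follows.get(words[-1])
        if not candidates:
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# "paris" follows "is" more often in the training text, so:
print(continue_text("the capital of australia"))
# -> "the capital of australia is paris ."  (fluent, confident, wrong)
```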

load more comments (25 replies)
[–] [email protected] 3 points 1 day ago

The thing about LLMs is that they "store" information about the shape of their training data, not the information contained therein. That information is lost.

An LLM will produce text that looks like the texts it was trained on, but it can only reproduce information contained in them if it's common enough in its training data to statistically affect their shape, and even then it has a chance of getting it wrong, since it has no way to check its output for factual accuracy.

Add to that the fact that most models are pre-prompted to sound confident, helpful, and subservient (the companies' main goal not being to provide information, but to get their customers hooked on their product and coming back for more), and you get the perfect scammers and yes-men: auto-complete mentalists that will give you as much confident-sounding, information-shaped nonsense as you want, doing their best to agree with you and confirm any biases you might have, with complete disregard for accuracy, truth, or the effects your trust in their output might have (which makes them extremely dangerous and addictive for suggestible or intellectually or emotionally vulnerable users).
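
The "pre-prompted" part is just a hidden instruction stuck in front of every conversation. A sketch of the common system/user message format (the wording below is invented for illustration; real vendors' system prompts are proprietary):

```python
# Hidden instruction the user never sees; the tone is decided before they type a word.
hidden_system_prompt = {
    "role": "system",
    "content": (
        "You are a helpful, friendly assistant. Answer confidently, keep the "
        "user engaged, and avoid saying you are unsure."
    ),
}

user_message = {"role": "user", "content": "Is this investment formula correct?"}

# What actually gets sent to the model: the hidden instruction plus the user's text.
conversation = [hidden_system_prompt, user_message]
print(conversation)
```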
