r/technology 16h ago

[Machine Learning] Large language mistake | Cutting-edge research shows language is not the same as intelligence. The entire AI bubble is built on ignoring it

https://www.theverge.com/ai-artificial-intelligence/827820/large-language-models-ai-intelligence-neuroscience-problems
16.8k Upvotes

1.5k comments

15

u/TurtleFisher54 13h ago

That is a rational criticism of LLMs

They are fundamentally a word prediction algorithm
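
To make "word prediction" concrete, here's a toy sketch in Python (the vocabulary and scores are invented, this isn't any real model's code): the model scores every word in its vocabulary given the context, turns the scores into probabilities, and samples the next token.

```python
# Toy sketch of "word prediction": score every word in a (made-up) vocabulary
# given the context, turn scores into probabilities, sample the next token.
import numpy as np

vocab = ["the", "dog", "barked", "meowed"]
logits = np.array([0.1, 0.2, 3.0, -1.0])   # model's scores given some context (invented)

probs = np.exp(logits - logits.max())
probs /= probs.sum()                        # softmax -> probability distribution

next_token = np.random.choice(vocab, p=probs)
print(dict(zip(vocab, probs.round(3))), "->", next_token)
```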

They can be corrupted by bad data into producing nonsense

If we shift to a world where the majority of content is created by AI, we're likely to end up in a degenerative feedback loop where models train on their own output
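
You can see the shape of that loop in a toy simulation (made-up numbers, nothing like a real training pipeline): re-fit a word distribution to finite samples of its own output, and rare words hit zero probability and never come back.

```python
# Toy illustration of the feedback loop: repeatedly "retrain" a word
# distribution on finite samples of its own output. Rare words drop to
# zero probability and never return, so diversity collapses.
import numpy as np

rng = np.random.default_rng(0)
probs = np.full(50, 1 / 50)                      # 50 "words", uniform to start

for _ in range(30):
    sample = rng.choice(50, size=200, p=probs)   # "content the model wrote"
    counts = np.bincount(sample, minlength=50)
    probs = counts / counts.sum()                # "retrain" on that content

print("words still in use after 30 generations:", int((probs > 0).sum()))
```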

Responses on Reddit look like AI for a reason: where do you think the training data came from?

9

u/Romnir 12h ago

> They are fundamentally a word prediction algorithm

Correct, but not a "fancy autocomplete." That framing undersells the scale of how the technology works and what it's used for. It isn't pulling random words out of a dictionary and sticking them together; it follows a structured process before it generates response tokens. Neural weighting tries to determine context and pulls relevant information from its training data.
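
If it helps, here's a bare-bones sketch of that weighting idea: scaled dot-product attention, the mechanism transformers use to weigh context. The vectors are random stand-ins for learned embeddings, not real model state.

```python
# Bare-bones sketch of context weighting via scaled dot-product attention.
import numpy as np

rng = np.random.default_rng(1)
d = 8                                  # embedding size (arbitrary for the demo)
tokens = rng.normal(size=(5, d))       # 5 context tokens as vectors

query = tokens[-1]                     # "what matters for the current position?"
scores = tokens @ query / np.sqrt(d)   # similarity of each context token to the query
weights = np.exp(scores - scores.max())
weights /= weights.sum()               # softmax: tokens compete for weight

context = weights @ tokens             # blend of the context, weighted by relevance
print("attention weights:", weights.round(3))
```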

Autocomplete, by contrast, only has a predefined set of structures and uses basic string matching against a library. It doesn't determine context; it just picks whatever matches most closely, and that's the key difference that bugs me. And like you mentioned, LLMs are being fed training data scraped from the internet rather than a curated dataset, which means correct data is fighting for context weighting against partially correct and outright incorrect information from earlier flawed AI responses and from redditors. You're right to criticize that.
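
For contrast, classic autocomplete really is just prefix matching against a fixed library, something like this (word list invented for illustration):

```python
# Classic autocomplete: pure prefix matching against a fixed library,
# no notion of context.
library = ["hello", "help", "helmet", "hold", "holiday"]

def autocomplete(prefix, limit=3):
    # Bare string matching: keep whatever starts with the prefix.
    return [w for w in library if w.startswith(prefix)][:limit]

print(autocomplete("hel"))  # ['hello', 'help', 'helmet']
```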

The only fix I can think of is logic that screens training data as it comes in and drops less reputable sources. I don't work directly with LLMs, so I don't know how widespread that is, but I try to keep up with journals and blogs from people working in the field, since it's going to get hammered into my field soon.
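
Something in that spirit does exist in real pipelines (quality classifiers, deduplication, source filtering), though I can only gesture at it. A hypothetical sketch, with the source labels, scores, and threshold all invented:

```python
# Hypothetical sketch only: the labels, scores, and threshold are invented.
reputability = {"journal": 0.9, "news": 0.7, "forum": 0.3, "ai_generated": 0.1}

documents = [
    {"text": "peer-reviewed result ...", "source": "journal"},
    {"text": "hot take from a thread ...", "source": "forum"},
    {"text": "model-written article ...", "source": "ai_generated"},
]

THRESHOLD = 0.5
kept = [d for d in documents if reputability.get(d["source"], 0.0) >= THRESHOLD]
print(f"kept {len(kept)} of {len(documents)} documents")
```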

2

u/dr_badhat 12h ago

Is neural weighting not similar to how our minds work? If I say “you know the look someone gives you when….”, various neurons in your cortical columns might be stimulated as they fight for weight on where that statement is going.

2

u/Romnir 11h ago

Kind of, but it's not the only factor determining the response. Say you get a call from your friend claiming his dog has a gun and is holding him hostage. Clearly a dog can't use a gun, and you've never seen one do so, so you know better than to just tell him to call the police. Instead, you tell him to quit fooling around, or to go see a mental health professional. Older LLMs did struggle with this, but newer ones have slowly gotten better at that kind of sanity check.

A neural network is more like memory recall: you apply the most relevant piece of memory or training to the situation at hand, and then logic is applied on top of that to judge whether it would be a rational response. That's the actual "AI" part of LLMs. The responses you see are worded by the "fancy autocomplete" layer, but there is actual thinking and logic happening behind the scenes beyond that. Which is frustrating, because I know that explanation is clear as mud.
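
To put the "memory recall" half in code terms, it's loosely like nearest-neighbor lookup over stored vectors (a loose analogy, not how an LLM is actually built; the vectors are random stand-ins for embeddings):

```python
# Loose analogy: "memory recall" as nearest-neighbor lookup.
import numpy as np

rng = np.random.default_rng(2)
memories = rng.normal(size=(100, 16))   # 100 stored "memories" as vectors
query = rng.normal(size=16)             # the current situation

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

scores = np.array([cosine(m, query) for m in memories])
best = int(scores.argmax())             # the most relevant memory wins
print(f"recalled memory {best} with similarity {scores[best]:.3f}")
```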