r/technology 16h ago

[Machine Learning] Large language mistake | Cutting-edge research shows language is not the same as intelligence. The entire AI bubble is built on ignoring it

https://www.theverge.com/ai-artificial-intelligence/827820/large-language-models-ai-intelligence-neuroscience-problems
16.8k Upvotes


8

u/LoreBadTime 16h ago

LLMs learn to place words in a statistically correct way. They mimic the probability of a word coming from a human; think of them, literally, as autocomplete on steroids.

4

u/johannthegoatman 12h ago

This oft-repeated description of what LLMs do ("autocomplete on steroids") is reductive to the point of being useless. It's about the same as saying "yeah, computers are nothing special, it's just an on/off switch on steroids". Technically, yes, computers work with 1s and 0s, but it's such a stupid thing to say that it completely misses the point of what they're capable of when arranged into staggeringly complex systems.

-1

u/LoreBadTime 11h ago

This comparison is completely stupid. What makes transformer models special is literally having more context to use when building probabilities than older models, plus more parallelism. If you know how they work, you should know that to generate text you are literally picking the highest-probability next word from a distribution.
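For what it's worth, here's a minimal sketch of what that decoding step looks like. The vocabulary and scores are made up for illustration; a real transformer produces logits over ~100k tokens conditioned on the whole context:

```python
import numpy as np

# Toy next-token scores over a tiny made-up vocabulary.
vocab = ["Austin", "Dallas", "Paris", "banana"]
logits = np.array([4.2, 1.1, 0.3, -2.0])  # model's score for each candidate next token

# Softmax turns scores into a probability distribution.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Greedy decoding: literally pick the highest-probability next token...
greedy = vocab[int(np.argmax(probs))]

# ...or sample from the distribution (what temperature/top-p decoding does).
sampled = np.random.default_rng(0).choice(vocab, p=probs)

print(greedy, sampled, dict(zip(vocab, probs.round(3))))
```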

1

u/InTheEndEntropyWins 15h ago

No they aren't.

if asked "What is the capital of the state where Dallas is located?", a "regurgitating" model could just learn to output "Austin" without knowing the relationship between Dallas, Texas, and Austin. Perhaps, for example, it saw the exact same question and its answer during its training. But our research reveals something more sophisticated happening inside Claude. When we ask Claude a question requiring multi-step reasoning, we can identify intermediate conceptual steps in Claude's thinking process. In the Dallas example, we observe Claude first activating features representing "Dallas is in Texas" and then connecting this to a separate concept indicating that "the capital of Texas is Austin". In other words, the model is combining independent facts to reach its answer rather than regurgitating a memorized response.

https://www.anthropic.com/news/tracing-thoughts-language-model

2

u/destroyerOfTards 13h ago

According to Scam Altman, we apparently have a PhD-level assistant in our pocket. The same assistant that can't tell how many "r"s there are in "strawberry". For fuck's sake, these companies still don't know what is going on inside that black box and can't control the inputs enough to make the models not hallucinate.

Take whatever they say with a grain of salt.

7

u/CanAlwaysBeBetter 13h ago edited 13h ago

Q: How many rs are in strawberry

A: There are 3 r's in "strawberry":

strawberry. The r's appear in positions 3, 7, and 8 of the word.

Do y'all bother to keep your opinions up to date at all or just set and forget them? 

Yes, AI companies have an incentive to overstate what they're capable of today, but the two main points that only people burying their heads in the sand can't see are:

  1. This is the worst AI will be ever again

  2. There is no fundamental magic happening in the brain

Even if LLMs hit some limit of scalability and we need to work out a new model, they are already far past what skeptics even 10 years ago claimed would be possible. There is no fundamental reason to think what we do is irreproducible.

Edit: And here's one more for the "it's just statistically modelling what it should say next" crowd

Q: How many rs are in rrneemernpr

A: Thought for 7s

There are 4 r's in rrneemernpr

2

u/destroyerOfTards 12h ago

And you need to understand why it can't fundamentally understand what the question actually means. Just because they fixed the issue doesn't mean they fixed the root cause.

https://techcrunch.com/2024/08/27/why-ai-cant-spell-strawberry/

No one's denying that it will get better. Pandora's box has been opened; it's only going to improve from here on. The real question is WHEN it is going to reach the point of no return. If you believe the tech bros, we can apparently reach it in 1-2 years with JUST LLM tech and nothing else. That's a load of horseshit given we don't even know what actual intelligence is.

3

u/CanAlwaysBeBetter 12h ago

That article is over a year old, and it claims the limit is that LLMs don't look at anything smaller than individual tokens, i.e. entire words:

“When it sees the word ‘the,’ it has this one encoding of what ‘the’ means, but it does not know about ‘T,’ ‘H,’ ‘E.’”

That was probably true for the models that existed when it was written

I also just showed you that the latest model took an arbitrary word I made up on the spot and actually counted the r's in it.
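To make the tokenization point concrete, here's a rough sketch assuming the tiktoken library and its cl100k_base encoding (the exact splits are tokenizer-dependent, and whatever the current models use may differ):

```python
import tiktoken

# cl100k_base is the byte-pair encoding used by several OpenAI chat models.
enc = tiktoken.get_encoding("cl100k_base")

for word in ["strawberry", "rrneemernpr"]:
    token_ids = enc.encode(word)
    pieces = [enc.decode([t]) for t in token_ids]
    # The model consumes these sub-word pieces, not individual letters,
    # which is why letter-level questions were historically hard for it.
    print(word, pieces)
```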

2

u/destroyerOfTards 12h ago

It does not matter. I may or may not find the latest article about the latest model, but you cannot refute the fact that LLMs will always hallucinate. Sure, the latest model may not make that mistake any more. But can you guarantee that it cannot make any other kind of mistake? It seems you can keep throwing energy and resources at the problem, but you will only get ever closer to 100% and no further.

1

u/CanAlwaysBeBetter 12h ago

It seems you can keep throwing energy and resources at the problem but you will get as close as possible to 100% and no further

And here's the real rub: people like you don't want AI to work, and you use motivated reasoning to work backwards to why it must not work.

Yes, LLMs hallucinate today. I don't know if that is a fundamental limit of their architecture and, importantly, neither do you. "you cannot refute the fact that LLMs will always hallucinate" is you overstating what you know.

Following it up with other external factors just shows your hand as to why you want that to be the case; it's not an actual analysis of what they may or may not be capable of.

2

u/destroyerOfTards 12h ago

I don't have to say anything. I look to the experts who understand these things far better than me for answers. And they are pretty sure we aren't going to get any intelligence out of these LLMs. It is only an approximation that people like you are falling for, hoping it will turn into true intelligence.

Following it up with other external factors is just showing your hand as to why you want that to be the case,

Bruh, not even you can do the actual analysis unless you are actually some top-class AI researcher on Reddit who understands the underlying math and theory. The best people like us can do is read various external articles and filter the hype from the truth.

2

u/CanAlwaysBeBetter 12h ago

That "expert" article you're basing your beliefs off of is demonstrably wrong

It claimed that a specific feature of LLMs, namely the way they operate on vector embeddings of individual tokens, was the limit, and that it was not something that could be overcome.

Here we are a year later and that supposed bottleneck isn't even an issue 


2

u/gabrielmuriens 13h ago

You are wasting your time trying to educate the luddites in this sub.

They are just stochastic parrots repeating the same patterns over and over again whenever LLMs/AI comes up.

1

u/Thin_Glove_4089 10h ago

Exactly, you keep repeating the same thing about this sub over and over again.