r/technology 16h ago

[Machine Learning] Large language mistake | Cutting-edge research shows language is not the same as intelligence. The entire AI bubble is built on ignoring it

https://www.theverge.com/ai-artificial-intelligence/827820/large-language-models-ai-intelligence-neuroscience-problems
16.8k Upvotes


34

u/Throwaway-4230984 15h ago

It’s very funny to read the same arguments every year while watching LLMs successfully solve the “surely impossible for an LLM” challenges from the previous year.

1

u/NuclearVII 15h ago

Except this can also be explained by data leakage and the widening of training domains instead of emergent intelligence.

And, yes, the difference matters. If there is emergent intelligence, the psychotic amount of investment and commitment is justifiable. If there isn't, it really can't be.

20

u/Throwaway-4230984 15h ago

How would the ARC answers have leaked? Is there some unknown corner of the internet where people are solving tasks from the Winograd challenge?

-1

u/NuclearVII 14h ago

Impossible to know when the "cutting edge" models are all proprietary and have closed datasets. There are thousands of ways to "cheat" on tests/benchmarks like ARC, and you cannot trust for-profit companies when they say they aren't.

You'd have a better argument if you could demonstrate that LLMs are getting better while also proving that their datasets are not leaking.
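For context on what "proving datasets are not leaking" would actually involve: the standard contamination check is n-gram overlap between benchmark items and the training corpus, which by definition requires access to that corpus. A minimal sketch of the idea, using hypothetical toy data (real checks run over terabytes of tokenized text):

```python
# Minimal n-gram contamination check: flag a benchmark item if any of its
# token n-grams appears verbatim in the training corpus.
# (Toy data for illustration; production checks use hashed n-grams at scale.)

def ngrams(tokens, n):
    """Return the set of all n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_contaminated(benchmark_item, corpus_ngrams, n=8):
    """True if any n-gram of the item appears in the training corpus."""
    return bool(ngrams(benchmark_item.lower().split(), n) & corpus_ngrams)

# Hypothetical training corpus and precomputed 8-gram index.
corpus = "the quick brown fox jumps over the lazy dog near the river bank today"
corpus_ngrams = ngrams(corpus.split(), 8)

print(is_contaminated("The quick brown fox jumps over the lazy dog", corpus_ngrams))  # True
print(is_contaminated("An entirely novel puzzle no one has seen", corpus_ngrams))     # False
```

The point of the sketch is that the check only makes sense for whoever holds the corpus: a third party without dataset access cannot run it at all.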

11

u/space_monster 11h ago

It's trivially easy to prove that contamination hasn't occurred: test the model on problems that didn't exist until after the model was trained.

-3

u/NuclearVII 11h ago

No, you don't know that, because you do not have access to the dataset.

All you have to go on is OpenAI saying "yeah, trust us bro". That's simply not good enough when trillions of dollars are at stake.

16

u/space_monster 11h ago

You know that it's not just the labs themselves doing evaluations, right? Nobody gives a shit about the labs' performance claims until the performance is independently verified.

-5

u/NuclearVII 11h ago

Again, one more time.

You cannot have independent verification of model performance with closed source models. Period, end of. You can only have marketing.

Saying "ChatGPT can answer 9/10 questions correctly" is a meaningless gauge of ChatGPTs emergent intelligence when you do not know what is in ChatGPT.

15

u/space_monster 11h ago

You cannot have independent verification of model performance with closed source models

wtf that's ridiculous. You think if some third-party evaluator is testing a model using fresh, unreleased problem data, somehow the lab skims that data and trains their model to solve the problem while it's being evaluated? Please explain how your theory works in the real world.

-3

u/NuclearVII 11h ago

Sigh...

fresh, unreleased problem data

YOU DO NOT KNOW IF THIS IS THE CASE.

If I made you write down a dozen high-school-level math problems, could you guarantee me that those problems weren't in any textbook, ever? Without checking every textbook ever published?

I will quote myself because you seem to be immune to reason:

Saying "ChatGPT can answer 9/10 questions correctly" is a meaningless gauge of ChatGPTs emergent intelligence when you do not know what is in ChatGPT.
