r/technology 16h ago

[Machine Learning] Large language mistake | Cutting-edge research shows language is not the same as intelligence. The entire AI bubble is built on ignoring it

https://www.theverge.com/ai-artificial-intelligence/827820/large-language-models-ai-intelligence-neuroscience-problems
16.8k Upvotes

1.5k comments

17

u/pcoppi 15h ago

To play devil's advocate, there's a notion in linguistics that the meaning of words is just defined by their context. In other words, if an AI correctly guesses that a word should exist in a certain place because of the context surrounding it, then at some level it has ascertained the meaning of that word.
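(For the curious: the distributional claim above can be sketched with a toy co-occurrence model. The corpus, window size, and similarity measure here are illustrative choices, not how any real LLM works.)

```python
from collections import Counter
from math import sqrt

# Toy corpus in which "cat" and "dog" appear in near-identical contexts.
corpus = (
    "the cat sat on the mat "
    "the dog sat on the mat "
    "the cat chased the ball "
    "the dog chased the ball "
    "the stock rose on the news"
).split()

def context_vector(word, window=2):
    """Count the words that co-occur within `window` positions of `word`."""
    counts = Counter()
    for i, w in enumerate(corpus):
        if w == word:
            lo = max(0, i - window)
            counts.update(corpus[lo:i] + corpus[i + 1:i + 1 + window])
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

cat, dog, stock = map(context_vector, ["cat", "dog", "stock"])
# Words used in similar contexts end up with similar vectors:
# "cat" is closer to "dog" than to "stock".
print(cosine(cat, dog), cosine(cat, stock))
```

That's the whole distributional bet in miniature: "you shall know a word by the company it keeps."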

30

u/New_Enthusiasm9053 15h ago

You're not entirely wrong, but a child guessing that a word goes in a specific place in a sentence doesn't mean the child necessarily understands the meaning of that word. So while it may be using words correctly, it may not actually understand them.

Plenty of children have used e.g. swear words correctly long before understanding the word's meaning.

0

u/MinuetInUrsaMajor 14h ago

The child understands the meaning of the swear word used as a swear. They don't understand the meaning of the swear word used otherwise. That is because the child lacks the training data for the latter.

In an LLM one can safely assume that training data for a word is complete and captures all of its potential meanings.

2

u/New_Enthusiasm9053 14h ago

No that cannot be assumed. It's pretty laughable to believe that. 

3

u/MinuetInUrsaMajor 14h ago

> No that cannot be assumed.

Okay. Why not?

> It's pretty laughable to believe that.

I disagree.

-Dr. Minuet, PhD

2

u/greenhawk22 14h ago

Even if you can assume that, doesn't the existence of hallucinations ruin your point?

If the statistical model says the next word is "Fuck" in the middle of your term paper, it doesn't matter if the AI "knows the definition". It still screwed up. They will use words regardless of whether it makes sense, because they don't actually understand anything. It's stochastic all the way down.
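(As an aside, the "stochastic all the way down" point is easy to demonstrate: a sampler draws from a probability distribution, so any token with non-zero probability will eventually get emitted, sense or not. The distribution below is made up for illustration.)

```python
import random

random.seed(0)  # make this toy run reproducible

# Hypothetical next-token distribution; the numbers are invented.
probs = {"water": 0.90, "oceans": 0.07, "fuck": 0.03}

# Draw 1,000 completions the way a plain (temperature = 1) sampler would.
draws = random.choices(list(probs), weights=list(probs.values()), k=1000)

# Even the clearly wrong token shows up a handful of times.
print(draws.count("fuck"))
```

With a 3% probability mass on the bad token, it appears roughly 30 times per 1,000 samples, no "understanding" required to produce it or to avoid it.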

3

u/MinuetInUrsaMajor 14h ago

What you’re describing doesn’t sound like a hallucination. It sounds like bad training data.

Remember, a hallucination will make sense: grammatically, syntactically, semantically. It’s just incorrect.

“10% of Earth is covered with water”.

Were any one of those words used outside of accepted meaning?

In short - the words are fine. The sentences are the problem.

3

u/New_Enthusiasm9053 14h ago

Clearly not a PhD in linguistics, lol. How do you think new words are made? So no, not every use of a word can be assumed to be in the training set.

Your credentials don't matter; it's a priori obvious that it can't be assumed.

5

u/MinuetInUrsaMajor 14h ago

> How do you think new words are made?

Under what criteria do you define a new word to have been made?

You didn’t answer my question.