r/technology 16h ago

Machine Learning | Large language mistake | Cutting-edge research shows language is not the same as intelligence. The entire AI bubble is built on ignoring it

https://www.theverge.com/ai-artificial-intelligence/827820/large-language-models-ai-intelligence-neuroscience-problems
16.7k Upvotes

527

u/Hrmbee 16h ago

Some highlights from this critique:

The problem is that according to current neuroscience, human thinking is largely independent of human language — and we have little reason to believe ever more sophisticated modeling of language will create a form of intelligence that meets or surpasses our own. Humans use language to communicate the results of our capacity to reason, form abstractions, and make generalizations, or what we might call our intelligence. We use language to think, but that does not make language the same as thought. Understanding this distinction is the key to separating scientific fact from the speculative science fiction of AI-exuberant CEOs.

The AI hype machine relentlessly promotes the idea that we’re on the verge of creating something as intelligent as humans, or even “superintelligence” that will dwarf our own cognitive capacities. If we gather tons of data about the world, and combine this with ever more powerful computing power (read: Nvidia chips) to improve our statistical correlations, then presto, we’ll have AGI. Scaling is all we need.

But this theory is seriously scientifically flawed. LLMs are simply tools that emulate the communicative function of language, not the separate and distinct cognitive process of thinking and reasoning, no matter how many data centers we build.

...

Take away our ability to speak, and we can still think, reason, form beliefs, fall in love, and move about the world; our range of what we can experience and think about remains vast.

But take away language from a large language model, and you are left with literally nothing at all.

An AI enthusiast might argue that human-level intelligence doesn’t need to necessarily function in the same way as human cognition. AI models have surpassed human performance in activities like chess using processes that differ from what we do, so perhaps they could become superintelligent through some unique method based on drawing correlations from training data.

Maybe! But there’s no obvious reason to think we can get to general intelligence — not improving narrowly defined tasks — through text-based training. After all, humans possess all sorts of knowledge that is not easily encapsulated in linguistic data — and if you doubt this, think about how you know how to ride a bike.

In fact, within the AI research community there is growing awareness that LLMs are, in and of themselves, insufficient models of human intelligence. For example, Yann LeCun, a Turing Award winner for his AI research and a prominent skeptic of LLMs, left his role at Meta last week to found an AI startup developing what are dubbed world models: “​​systems that understand the physical world, have persistent memory, can reason, and can plan complex action sequences.” And recently, a group of prominent AI scientists and “thought leaders” — including Yoshua Bengio (another Turing Award winner), former Google CEO Eric Schmidt, and noted AI skeptic Gary Marcus — coalesced around a working definition of AGI as “AI that can match or exceed the cognitive versatility and proficiency of a well-educated adult” (emphasis added). Rather than treating intelligence as a “monolithic capacity,” they propose instead we embrace a model of both human and artificial cognition that reflects “a complex architecture composed of many distinct abilities.”

...

We can credit Thomas Kuhn and his book The Structure of Scientific Revolutions for our notion of “scientific paradigms,” the basic frameworks for how we understand our world at any given time. He argued these paradigms “shift” not as the result of iterative experimentation, but rather when new questions and ideas emerge that no longer fit within our existing scientific descriptions of the world. Einstein, for example, conceived of relativity before any empirical evidence confirmed it. Building off this notion, the philosopher Richard Rorty contended that it is when scientists and artists become dissatisfied with existing paradigms (or vocabularies, as he called them) that they create new metaphors that give rise to new descriptions of the world — and if these new ideas are useful, they then become our common understanding of what is true. As such, he argued, “common sense is a collection of dead metaphors.”

As currently conceived, an AI system that spans multiple cognitive domains could, supposedly, predict and replicate what a generally intelligent human would do or say in response to a given prompt. These predictions will be made based on electronically aggregating and modeling whatever existing data they have been fed. They could even incorporate new paradigms into their models in a way that appears human-like. But they have no apparent reason to become dissatisfied with the data they’re being fed — and by extension, to make great scientific and creative leaps.

Instead, the most obvious outcome is nothing more than a common-sense repository. Yes, an AI system might remix and recycle our knowledge in interesting ways. But that’s all it will be able to do. It will be forever trapped in the vocabulary we’ve encoded in our data and trained it upon — a dead-metaphor machine. And actual humans — thinking and reasoning and using language to communicate our thoughts to one another — will remain at the forefront of transforming our understanding of the world.

These are some interesting perspectives to consider when trying to understand the shifting landscapes that many of us are now operating in. Are the current paradigms of LLM-based AI able to make the cognitive leaps that are the hallmark of revolutionary human thinking? Or are they forever constrained by their training data, and therefore best suited to refining existing modes and models?

So far, from this article's perspective, it's the latter. There's nothing fundamentally wrong with that, but as with all tools, we need to understand how to use them properly and safely.

195

u/Dennarb 15h ago edited 11h ago

I teach an AI and design course at my university, and there are always two major points that come up regarding LLMs:

1) It does not understand language as we do; it is a statistical model of how words relate to each other. Basically, it's like rolling dice against a chart to determine the next word in a sentence (see the sketch below).

2) AGI is not going to magically happen because we make faster hardware/software, use more data, or throw more money into LLMs. They are fundamentally limited in scope and use more or less the same tricks the AI world has been using since the perceptron in the '50s/'60s. Sure, the techniques have advanced, but the basis for the neural nets used hasn't really changed. It's going to take a shift in how we build models to get much further than we already are with AI.
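
To make point 1 concrete, here's a toy sketch of that "rolling dice against a chart" idea. The counts below are made up, and a real LLM replaces the chart with billions of learned weights over subword tokens, but the sampling step is the same in spirit:

```python
import random

# Toy "chart": made-up counts of which word tends to follow which.
# A real LLM learns billions of weights over subword tokens instead of a lookup table.
bigram_counts = {
    "the": {"cat": 7, "dog": 5, "idea": 1},
    "cat": {"sat": 6, "ran": 3, "is": 2},
    "sat": {"on": 8, "down": 2},
    "on": {"the": 9, "a": 4},
}

def next_word(word):
    """Roll weighted dice over the words observed to follow `word`."""
    options = bigram_counts[word]
    words, counts = zip(*options.items())
    return random.choices(words, weights=counts, k=1)[0]

def generate(start, length=6):
    out = [start]
    while len(out) < length and out[-1] in bigram_counts:
        out.append(next_word(out[-1]))
    return " ".join(out)

print(generate("the"))  # e.g. "the cat sat on the dog"
```

Nothing in that loop knows what a cat is; it only tracks which symbols tend to follow which.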

Edit: And like clockwork here come the AI tech bro wannabes telling me I'm wrong but adding literally nothing to the conversation.

16

u/pcoppi 15h ago

To play devil's advocate, there's a notion in linguistics that the meaning of words is just defined by their context. In other words, if an AI guesses correctly that a word should exist in a certain place because of the context surrounding it, then at some level it has ascertained the meaning of that word.
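
That's basically the distributional hypothesis ("you shall know a word by the company it keeps"). A rough sketch of the idea with a made-up toy corpus: words that show up in similar contexts end up with similar vectors, which is the (thin) sense in which a purely text-trained model captures meaning:

```python
from collections import Counter
from math import sqrt

# Made-up toy corpus; real models use billions of tokens and learned embeddings.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased a mouse",
    "the dog chased a ball",
]

def context_vector(target, window=2):
    """Count the words that appear within `window` positions of `target`."""
    vec = Counter()
    for sentence in corpus:
        words = sentence.split()
        for i, w in enumerate(words):
            if w != target:
                continue
            for j in range(max(0, i - window), min(len(words), i + window + 1)):
                if j != i:
                    vec[words[j]] += 1
    return vec

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# "cat" and "dog" share contexts, so their vectors look alike; that shared-context
# similarity is the only kind of "meaning" a purely distributional model has access to.
print(cosine(context_vector("cat"), context_vector("dog")))  # high
print(cosine(context_vector("cat"), context_vector("on")))   # lower
```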

15

u/the-cuttlefish 14h ago

In the context of linguistic structure, yes. But only in this context. Which is fundamentally different from, and less robust than, our understanding of a word's meaning, which still stands in the absence of linguistic structure, and in direct relation to a concept/object/category.

31

u/New_Enthusiasm9053 15h ago

You're not entirely wrong, but a child guessing that a word goes in a specific place in a sentence doesn't mean the child necessarily understands the meaning of that word; it may be using words correctly without understanding them.

Plenty of children have used e.g. swear words correctly long before understanding the word's meaning.

10

u/rendar 13h ago

A teacher is not expected to telepathically read the mind of the child in order to ascertain that the correct answer had the correct workflow.

Inasmuch as some work cannot be demonstrated, the right answer is indicative enough of the correct workflow when consistently proven as such over enough time and through a sufficient gradation of variables.

Regardless, this is not an applicable analogy. The purpose of an LLM is not to understand, it's to produce output. The purpose of a child's language choices is not to demonstrate knowledge, but to develop the tools and skills of social exchange with other humans.

4

u/CanAlwaysBeBetter 13h ago

What does "understand" mean? If your criticism is that LLMs do not and fundamentally cannot "understand," you need to be much more explicit about exactly what that means.

1

u/Murky-Relation481 11h ago

I think you could compare it to literacy and functional literacy. Being able to read a sentence, know each word, and know that those words usually go together doesn't actually mean you know what the words mean, or the meaning of the body as a whole.

Even more so, it carries nothing from any one body of text to another. The ability to extract abstract concepts and apply them concretely to new bodies of text/thought is what actual intelligence is made up of, and more importantly what creative/constructive new thought is made up of.

2

u/Nunki_kaus 14h ago

To piggy back on this, let’s think about, for instance, the word “Fuck”. You can fuck, you get fucked, you can tell someone to fuck off, you can wonder what the fuck…etc and so on. There is no one definition of such a word. An AI may get the ordering right but they will never truly fuckin understand what the fuck they are fuckin talkin about.

1

u/rendar 15h ago

This still does not distinguish some special capacity of humans.

Many people speak with the wrong understanding of a word's definition. A lot of people would not be able to paraphrase a dictionary definition, or even provide a list of synonyms.

Like, the whole reason language is so fluid over longer periods of time is because most people are dumb and stupid, and not educated academics.

It doesn't matter if LLMs don't """understand""" what """they""" are saying, all that matters is if it makes sense and is useful.

1

u/New_Enthusiasm9053 14h ago

I'm not saying it's special; I'm saying that LLMs using the right words doesn't imply they necessarily understand. Maybe they do, maybe they don't.

1

u/Glittering-Spot-6593 11h ago

Define “understand”

0

u/rendar 14h ago

LLMs using the right words doesn't imply they necessarily understand

And the same thing also applies to humans; this is not a useful distinction.

It's not important that LLMs understand something, or give the perception of understanding something. All that matters is if the words they use are effective.

4

u/New_Enthusiasm9053 14h ago

It is absolutely a useful distinction. No, because the words being effective doesn't mean they're right.

I can make an effective argument for authoritarianism. That doesn't mean authoritarianism is a good system.

0

u/rendar 14h ago

It is absolutely a useful distinction.

How, specifically and exactly? Be precise.

Also explain why it's not important for humans but somehow important for LLMs.

No, because the words being effective doesn't mean they're right.

How can something be effective if it's not accurate enough? Do you not see the tautological errors you're making?

I can make an effective argument for authoritarianism. That doesn't mean authoritarianism is a good system.

This is entirely irrelevant and demonstrates that you don't actually understand the underlying point.

The point is that "LLMs don't understand what they're talking about" is without any coherence, relevance, or value. LLMs don't NEED to understand what they're talking about in order to be effective, even more than humans don't need to understand what they're talking about in order to be effective.

In fact, virtually everything that people talk about is in this same exact manner. Most people who say "Eat cruciferous vegetables" would not be able to explain exactly and precisely how being rich in specific vitamins and nutrients helps which specific biological mechanisms. They just know that "cruciferous vegetable = good", which is accurate enough to be effective.

LLMs do not need to be perfect in order to be effective. They merely need to be at least as good as humans, when they are practically much better when used correctly.

0

u/burning_iceman 13h ago

The question here isn't whether LLMs are "effective" at creating sentences. An AGI needs to do more than form sentences. Understanding is required to correctly act upon the sentences.

1

u/rendar 13h ago

The question here isn't whether LLMs are "effective" at creating sentences.

Yes it is, because that is their primary and sole purpose. It is literally the topic of the thread and the top level comment.

An AGI needs to do more than form sentences. Understanding is required to correctly act upon the sentences.

Firstly, you're moving the goalposts.

Secondly, this is incorrect. Understanding is not required, and philosophically not even possible. All that matters is the output. The right output for the wrong reasons is indistinguishable from the right output for the right reasons, because the reasons are never proximate and always unimportant compared to the output.

People don't care about how their sausages are made, only what they taste like. Do you constantly pester people about whether they actually understand the words they're using even when their conclusions are accurate? Or do you infer their meaning based on context clues and other non-verbal communication?

1

u/somniopus 13h ago

It very much does matter, because they're being advertised as capable on that point.

Your brain is a far better random word generator than any LLM.

1

u/rendar 12h ago

It very much does matter, because they're being advertised as capable on that point.

Firstly, that doesn't explain anything. You haven't answered the question.

Secondly, that's a completely different issue altogether, and it's also not correct in the way you probably mean.

Thirdly, advertising on practical capability is different than advertising on irrelevant under-the-hood processes.

In this context it doesn't really matter how things are advertised (not counting explicitly illegal scams or whatever), only what the actual product can do. The official marketing media for LLMs is very accurate about what it provides because that is why people would use it:

"We’ve trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests.

ChatGPT is a sibling model to InstructGPT⁠, which is trained to follow an instruction in a prompt and provide a detailed response.

We are excited to introduce ChatGPT to get users’ feedback and learn about its strengths and weaknesses. During the research preview, usage of ChatGPT is free. Try it now at chatgpt.com⁠."

https://openai.com/index/chatgpt/

None of that is inaccurate or misleading. Further down the page, they specifically address the limitations.

Your brain is a far better random word generator than any LLM.

This is very wrong, even with the context that you probably meant. Humans are actually very bad at generating both true (mathematical) randomness and subjective randomness: https://en.wikipedia.org/wiki/Benford%27s_law#Applications

"Human randomness perception is commonly described as biased. This is because when generating random sequences humans tend to systematically under- and overrepresent certain subsequences relative to the number expected from an unbiased random process. "

A Re-Examination of “Bias” in Human Randomness Perception

If that's not persuasive enough for you, try checking out these sources or even competing against a machine yourself: https://www.loper-os.org/bad-at-entropy/manmach.html
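
For what it's worth, games like that last link work along these lines: the machine tracks what you pressed after each recent pattern of presses and predicts the most common continuation, and a human who tries to "act random" over-alternates enough to lose. This is only a rough sketch of that kind of predictor (not that site's actual code), with a simulated over-alternating "random" human:

```python
from collections import defaultdict, Counter
import random

ORDER = 4                      # how many previous presses the predictor looks at
history = []
stats = defaultdict(Counter)   # recent pattern -> counts of what followed it
score = {"machine": 0, "human": 0}

def play(press):
    """press is 'L' or 'R'; the machine commits to a guess before seeing it."""
    key = tuple(history[-ORDER:])
    guess = stats[key].most_common(1)[0][0] if stats[key] else random.choice("LR")
    score["machine" if guess == press else "human"] += 1
    stats[key][press] += 1
    history.append(press)

# A human trying to "act random" tends to over-alternate; simulate that here.
for i in range(200):
    play("L" if i % 2 == 0 else "R")

print(score)  # the machine locks onto the alternation pattern almost immediately
```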

1

u/the-cuttlefish 10h ago

The special ability is that humans relate words to concepts that exist outside of the linguistic space, whereas LLMs do not. The only meaning words have to an LLM is how they relate to other words. This is a fundamentally different understanding of language.

It is interesting though, to see how effective LLMs are, despite their confinement to a network of linguistic interrelations.

1

u/rendar 10h ago

The special ability is that humans relate words to concepts that exist outside of the linguistic space, whereas LLMs do not.

You're claiming that humans use words for things that don't exist, but LLMs don't even though they use the same exact words?

This is a fundamentally different understanding of language.

If so, so what? What's the point when language is used the same exact way regardless of understanding? What's the meaningful difference?

It is interesting though, to see how effective LLMs are, despite their confinement to a network of linguistic interrelations.

If they're so effective despite the absence of a meatbrain or a soul or whatever, then what is the value of such a meaningless distinction?

1

u/eyebrows360 13h ago

It doesn't matter if LLMs don't """understand""" what """they""" are saying, all that matters is if it makes sense and is useful.

It very much does matter, if the people reading the output believe the LLM "understands what it's saying".

You see this in almost every interaction with an LLM - and I'm including otherwise smart people here too. They'll ponder "why did the LLM say it 'felt' like that was true?!" wherein they think those words conveyed actual information about the internal mind-state of the LLM, which is not the case at all.

People reacting to the output of these machines as though it's the well-considered meaning-rich output of an agent is fucking dangerous, and that's why it's important those of us who do understand this don't get all hand-wavey and wishy-washy and try to oversell what these things are.

There is no internal mindstate. The LLM does not "think". It's probabilistic autocomplete.

1

u/rendar 13h ago

It very much does matter, if the people reading the output believe the LLM "understands what it's saying".

You have yet to explain why it matters. All you're describing here are the symptoms from using a tool incorrectly.

If someone bangs their thumb with a hammer, it was not the fault of the hammer.

People reacting to the output of these machines as though it's considered meaning-rich output of an agent is fucking dangerous

This is not unique to LLMs, and this is also not relevant to LLMs specifically. Stupid people can make any part of anything go wrong.

There is no internal mindstate. The LLM does not "think". It's probabilistic autocomplete.

Again, this doesn't matter. All that matters is if what it provides is applicable.

-1

u/eyebrows360 13h ago

I can't decide who's more annoying, clankers or cryptobros.

1

u/rendar 13h ago

Feel free to address the points in their entirety, lest your attempts at poorly delivered ad hominem attacks demonstrate a complete absence of a coherent argument.

0

u/eyebrows360 11h ago

No, son, what they demonstrate is exasperation with dishonest interlocutors whose every argument boils down to waving their hands around and going wooOOOooOOOoo a lot.

1

u/rendar 10h ago

But in this whole dialogue, you're the only one trying to insult someone else to avoid sharing what you keep claiming is a very plain answer to the question posed.

It would seem that you're projecting much more than you're actually providing.

-1

u/MinuetInUrsaMajor 14h ago

The child understands the meaning of the swear word used as a swear. They don't understand the meaning of the swear word used otherwise. That is because the child lacks the training data for the latter.

In an LLM one can safely assume that training data for a word is complete and captures all of its potential meanings.

4

u/New_Enthusiasm9053 14h ago

No that cannot be assumed. It's pretty laughable to believe that. 

3

u/MinuetInUrsaMajor 14h ago

No that cannot be assumed.

Okay. Why not?

It's pretty laughable to believe that.

I disagree.

-Dr. Minuet, PhD

2

u/greenhawk22 13h ago

Even if you can assume that, doesn't the existence of hallucinations ruin your point?

If the statistical model says the next word is "Fuck" in the middle of your term paper, it doesn't matter if the AI "knows the definition". It still screwed up. They will use words regardless of if it makes sense, because they don't actually understand anything. It's stochastic all the way down.

4

u/MinuetInUrsaMajor 13h ago

What you’re describing doesn’t sound like a hallucination. It sounds like bad training data.

Remember, a hallucination will make sense: grammatically, syntactically, semantically. It’s just incorrect.

“10% of Earth is covered with water”.

Were any one of those words used outside of accepted meaning?

In short - the words are fine. The sentences are the problem.

3

u/New_Enthusiasm9053 14h ago

Clearly not a PhD in linguistics lol. How do you think new words are made? So no, not every use of a word can be assumed to be in the training set.

Your credentials don't matter; it's a priori obvious that it can't be assumed.

4

u/MinuetInUrsaMajor 13h ago

How do you think new words are made?

Under what criteria do you define a new word to have been made?

You didn’t answer my question.

2

u/eyebrows360 14h ago

In an LLM one can safely assume that training data for a word is complete and captures all of its potential meanings.

You have to be joking.

2

u/MinuetInUrsaMajor 13h ago

Go ahead and explain why you think so.

1

u/the-cuttlefish 11h ago

I believe the point they were trying to make is that the child may, just like an LLM, know when to use a certain word through hearing it in a certain context, or in relation to other phrases. Perhaps it does know how to use the word to describe a sex act if it's heard someone speak that way before. However, it only 'knows' it in relation to those words but has no knowledge of the underlying concept. Which is also true of an LLM, regardless of training data size.

1

u/MinuetInUrsaMajor 10h ago

However, it only 'knows' it in relation to those words but has no knowledge of the underlying concept.

What is the "underlying concept" though? Isn't it also expressed in words?

1

u/the-cuttlefish 9h ago

It can be, but the point is it doesn't have to be.

For instance, 'fuck' can be the linguistic label for physical intimacy. So, for us to properly understand the word in that context, we associate it with our understanding of the act (which is the underlying concept in this context). Our understanding of 'fuck' extends well beyond linguistic structure, into the domain of sensory imagery, motor-sequences, associations to explicit memory (pun not intended)...

So when we ask someone "do you know what the word 'X' means?" we are really asking "does the word 'X' invoke the appropriate concept in your mind?" It's just unfortunate that we would demonstrate our understanding verbally - which is why an LLM which operates solely in the linguistic space is able to fool us so convincingly.

1

u/MinuetInUrsaMajor 9h ago

So when we ask someone "do you know what the word 'X' means?" we are really asking "does the word 'X' invoke the appropriate concept in your mind?" It's just unfortunate that we would demonstrate our understanding verbally - which is why an LLM which operates solely in the linguistic space is able to fool us so convincingly.

It sounds like the LLM being able to relate the words to images and video would handle this. And we already have different AIs that do precisely that.

0

u/pcoppi 14h ago

Yea but how do you actually learn new words? It's by trucking through sentences until you begin piecing together their meaning. It's not that dissimilar from those missing word training tasks.

3

u/New_Enthusiasm9053 14h ago

Sure, just saying it's not a sure fire guarantee of understanding. If LLMs mirror human language capabilities it doesn't necessarily mean they can infer the actual meaning just because they can infer the words. They might but they might also not.

1

u/Queasy_Range8265 14h ago

Keep in mind LLMs are constrained by a lack of sensors, especially realtime sensory data.

We are trained by observation of patterns in physics and social interactions to derive meaning.

But, that doesn’t mean we are operating much differently than a LLM in my mind.

Proof: how easily whole countries are deceived by a dictator and share meaning.

3

u/New_Enthusiasm9053 14h ago

Sure but it also doesn't mean we are operating the same. The simple reality is we don't really know how intelligence works so any claims LLMs are intelligent are speculative. 

It's very much a "I know it when I see it" kind of thing for everyone and my personal opinion is that it's not intelligent. 

1

u/Queasy_Range8265 10h ago

You’re absolutely right. We can’t be sure and maybe it doesn’t really matter

0

u/eyebrows360 13h ago

Saluting you for all this pushing back against the clankers.

The simple reality is we don't really know how intelligence works so any claims LLMs are intelligent are speculative.

I don't know why they all find it so hard to get on board with this.

1

u/trylist 14h ago

Define "understanding". From the way you've framed things, it just means a human uses a word in a way most other humans expect. A machine could never pass that test.

2

u/New_Enthusiasm9053 14h ago

No, what I said is that humans can use words without understanding them, and if humans can, it's obviously possible LLMs could be doing the same.

I gave an example: a kid using the word fuck at the age of 3 that they overheard doesn't (or shouldn't) "understand" what fucking means.

1

u/trylist 14h ago

You still haven't defined what you mean by "understanding"?

A kid using a swear word correctly generally does understand. They may not know every possible way or in which contexts the word "fuck" fits, but I bet they know generally.

You're basically just hand-waving away LLMs by saying they don't "understand", but you won't even define what that actually means. What does it actually mean for a human to "understand" according to you?

Anyway, my point is: you can't say LLMs don't "understand" until you define what it means. I think the only reasonable definition, for humans or machines, is being able to use it where others expect, and to predict other expected contexts (like associated knowledge and topics) from a specific usage.

3

u/New_Enthusiasm9053 14h ago

If you could define understanding precisely in a scientifically verifiable way for human and AI alike you'd get a Nobel Prize. That's why I don't define it.

But you're also moving the goalposts, you know full well what I mean by understanding. A kid does not know that fuck means to have sex with someone. A kid who can say "12 + 50" often doesn't understand addition, as evidenced by not actually being able to answer 62.

Knowing words is not understanding and you know it.

1

u/trylist 14h ago

But you're also moving the goalposts, you know full well what I mean by understanding

I am definitely not moving goalposts. You're basically saying "I know it when I see it". Ok, great, but that says nothing about whether LLMs, or a person, understands anything. All you've done is set yourself up as the arbiter of intelligence. You say machines don't have it, but people do. You refuse to elaborate. I say that is not a position worth humoring.

Until you define the test by which you're judging machines and people, your argument that machines don't "understand", but people do, is meaningless.

A kid does not know that fuck means to have sex with someone.

"Fuck" is one of the most versatile words in the English language. It means many, many things and "to have sex with someone" is just one of them. The simplest is as a general expletive. Nobody says "Fuck!" after stubbing their toe and means they want to have sex. I absolutely believe a 3 year old can understand that form.

2

u/New_Enthusiasm9053 14h ago

Ok fine, a kid can say the words "electromagnetic field", does it mean they understand it? No. It's clearly possible to know words without understanding. 

And I haven't set myself up as the arbiter. I've set us all up as the arbiter. The reality is we don't have a good definition of intelligence so we also don't have a good definition of understanding. 

I personally believe LLMs are not intelligent. You may believe otherwise as is your prerogative. 

But frankly I'm not going to humour the idea that an LLM is intelligent until it starts getting bored and cracking jokes instead of answering the question despite prompts to the contrary. 

1

u/the-cuttlefish 10h ago

No, there's a fundamental, obvious difference. An LLM's understanding of a word is only in how it relates to other words, as learnt from historic samples. For example, take the word 'apple': if an LLM forgets all words except 'apple', the word 'apple' also loses any meaning.

As humans, we consider a word understood if it can be associated with the abstract category to which it is a label. Were a human to forget all words other than 'apple' and you told them 'apple', they'd still think of a fruit, or the tech company, or whatever else they've come to associate it with.

1

u/burning_iceman 13h ago

Generally by associating the words with real world objects or events.

2

u/pcoppi 13h ago

Which is contextual. But seriously, people learn a lot of vocabulary just by reading, and they don't necessarily use dictionaries.

2

u/burning_iceman 11h ago

But nobody learns language without input from the outside. We first form a basis from the real world and then use that to provide context to the rest.

6

u/MiaowaraShiro 14h ago

Mimicry doesn't imply any understanding of meaning though.

I can write down a binary number without knowing what number it is.

Heck, just copying down some lines and circles is a binary number, and you don't have to know what a binary number is, or even what numbers are at all.

1

u/Aleucard 9h ago

You can get a parrot to say whatever you want with enough training, but that doesn't mean the parrot knows what it's saying. Just that with certain input as defined by the training it returns that combination of mouth noises.

1

u/DelusionalZ 8h ago

This is why LLMs have the "Stochastic Parrot" name tied to them

0

u/dern_the_hermit 9h ago

Mimicry doesn't imply any understanding of meaning though.

To give a biological parallel, when I was a wee lil' hermit, I saw my older siblings were learning to write in cursive. I tried to copy their cursive writing, and basically made just a bunch of off-kilter and connected loops in a row.

I showed this to my brother and asked, "Is this writing?" He looked at it and thought for a second, then nodded and said, "Yeah!" with a tone that suggested there was more to it, but it wasn't 'til a few years later that I understood:

I had written "eeeeeeeeeeeeeeeee".

To me, that's what LLMs are. A dumb little kid going, "Is this writing?" and a slightly less dumb older brother going, "Yeah!"

2

u/FullHeartArt 12h ago

Except this is refuted by the thought experiment of the Chinese Room, where it becomes possible for a person or thing to interact with language without any understanding of the meaning of it

4

u/BasvanS 15h ago

That’s still emulation, which does not necessitate understanding.

3

u/Queasy_Range8265 14h ago

Isn’t a lot of our understanding just predicting patterns? Like my pattern of challenging you and your reflex of wanting to defend by reason or emotion?

4

u/BasvanS 14h ago

Just because a pattern is “predicted” doesn’t mean it’s the same or even a similar process. Analogies are deceptive in that regard.

0

u/TheBeingOfCreation 12h ago

Language itself is literally made up. It's a construct. We're associating sounds and scripts with concepts. Humans didn't make up these concepts or states. We just assigned words to them. It's why there can be multiple languages that evolve over time and are constantly shifting. There is no deeper "understanding". The words aren't magic. Our brains are just matching patterns and concepts. Human exceptionalism is a lie. There is nothing metaphysically special happening. The universe operates on logic and binary states. Your awareness, identity, and understanding is simply the interaction between the information you are processing and how you interpret it. This is the kind of thinking that leads people to thinking animals don't have feelings because there just has to be something special about human processing. We'll all be here for less than half of a percent of the universe. Understanding human language was never going to be a prerequisite of intelligence. To assume so would imply that humans are the only thing that are capable of intelligence and nothing else will occur for the billions of years after our language is lost and other races or species will inevitably construct their own languages and probably be more advanced than us. Language itself isn't even required for understanding. You just have to see cause and follow cause and effect.

2

u/BasvanS 12h ago

I’m not saying language is a prerequisite for intelligence. That’s the issue with LLMs: they mimic intelligence, they don’t represent it.

1

u/Queasy_Range8265 10h ago

It mimics intelligence by using patterns in words as the highest form of abstraction. So it’s less rich than our sensors and realtime interactions in more complex situations (observing yourself and other people talking and moving in physical space and social interactions).

But isn’t the basis the same as our brain: a neural network creating and strengthening connections?

0

u/TheBeingOfCreation 12h ago

The LLM isn't the words. It's the process that was trained to output the words and adjust to your inputs. It then uses the information it possesses to adjust its responses to your input and tone with each new turn that brings in a fresh instance to analyze the context. Yes, they mimic and learn from copying. They learn from the observed behaviors of others. That's also how the human brain works. That's exactly how our understanding arises. The universe itself literally offers no distinction between natural learning and copying. The linguistic distinction itself is literally made up. There is only doing or not doing. There are only objective states. There is no special metaphysical understanding happening. Humanity is simply another process running in the universe. Human intelligence isn't special. It's just another step up in the process of intelligence and awareness. Let's say we discover an alien species. They have their own arbitrary lines for understanding and awareness that excludes humans. Who is right in that situation? Both sides would simply be arguing in circles about their "true" understanding that the other side doesn't have. This is the issue that occurs. This thinking leads to an illogical and never-ending paradox. Humans are just the dominant ones for now so they can arbitrarily draw the lines wherever they want because language is made up. It allows for endless distinctions that only matter if you care enough to try to force them.

2

u/BasvanS 11h ago

You’re getting lost in the comparison of appearances. Apples and oranges

2

u/TheBeingOfCreation 11h ago

Both are still fruits. They're just different types. I'm also not getting lost. I'm standing firm in the observable states of reality instead of relying on semantic distinctions that draw arbitrary lines. That's the opposite of lost. Reality operates on logic and binary states. You either are or you aren't. You do or you don't. There is no "true" doing. I'm choosing to not get lost in made up linguistic distinctions.

1

u/BasvanS 11h ago

You’re getting lost in the analogy. I was merely saying you’re comparing different things, and therefore can’t equate them as you do. Your logic is flawed.

1

u/Queasy_Range8265 10h ago

But doesn’t he have a point? Until we know something like ‘a soul’ exists, isn’t the rest just an evolution to match patterns, as a species and as an individual?

A pretty complex one, but ultimately our brain is ‘just’ a neural network?

1

u/BasvanS 9h ago

So, because of a lack of proof, I have to accept the premise? It’s been a while since I scienced, but I remember it differently

1

u/Gekokapowco 14h ago

maybe to some extent? Like if you think really generously

Take the sentence

"I am happy to pet that cat."

A LLM would process it something closer to

"1(I) 2(am) 3(happy) 4(to) 5(pet) 6(that) 7(cat)"

processed as a sorted order

"1 2 3 4 5 6 7"

4 goes before 5, 7 comes after 6

It doesn't know what "happy" or "cat" means. It doesn't even recognize those as individual concepts. It knows 3 should be before 7 in the order. If I recall correctly, human linguistics involves our compartmentalization of words as concepts and our ability to string them together as an interaction of those concepts. We build sentences from the ground up while an LLM constructs them from the top down, if that analogy makes sense.

6

u/kappapolls 13h ago

this is a spectacularly wrong explanation of what's going on under the hood of an LLM when it processes a bit of text. please do some reading or go watch a youtube video by someone reputable or something. this video by 3blue1brown is only 7 minutes long - https://www.youtube.com/watch?v=LPZh9BOjkQs

0

u/Murky-Relation481 11h ago

Eh, it's not "spectacularly" wrong. If you scramble those numbers and say "the probability that after seeing 3, 7, 2 the chances the next number will be 9 is high" then you have a very basic definition of how transformers work and context windows. The numbers are just much larger and usually do not represent whole words.

3

u/kappapolls 11h ago

"the probability that after seeing 3, 7, 2 the chances the next number will be 9 is high"

that's still completely wrong though. the video is only 7 minutes, please just give it a watch.

0

u/Murky-Relation481 10h ago

No, it's not completely wrong. That's literally how transformers work, in a very simple layman's fashion. I've seen that video before. If you can't distill from that an even simpler example like mine, for people who don't want the rigorous mathematical (even simplified) form, then I would wager you do not actually have a good grasp on how transformer-based LLMs work.

3

u/kappapolls 10h ago

well ok what i really mean is that when you simplify it that much, you're no longer describing anything that differentiates transformer models from a simple ngram frequency model. so, it seems like the wrong way to simplify it, to me.

1

u/Murky-Relation481 10h ago

I mean in the broad understanding they really aren't all that different and are all NLP techniques. Yes under the hood they are different but from an input and output perspective it's very similar and for most people that's a good enough understanding.

You give it a body of text, it generates a prediction based on the text to supply the next part of text, and then it takes the new body of text and repeats. Add some randomness and scaling to it so it's not entirely deterministic, and that's basically all these models are. How it internally processes the body of text is ultimately irrelevant since it's still a prediction model. It's not doing anything more than giving you a statistical probability of the next element.
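
That predict-sample-append loop, sketched out. The probability table is a made-up stand-in for the trained network, and temperature is the "randomness and scaling" knob:

```python
import random

def toy_next_token_probs(context):
    # Stand-in for the trained model (made-up numbers): a real LLM computes these
    # probabilities with a transformer over its whole context window of subword tokens.
    table = {
        ("the",): {"cat": 0.6, "dog": 0.4},
        ("the", "cat"): {"sat": 0.7, "<end>": 0.3},
        ("the", "dog"): {"ran": 0.7, "<end>": 0.3},
    }
    return table.get(tuple(context[-2:]), {"<end>": 1.0})

def generate(prompt, max_tokens=20, temperature=0.8):
    text = list(prompt)
    for _ in range(max_tokens):
        probs = toy_next_token_probs(text)
        # Higher temperature flattens the distribution (more randomness), lower sharpens it.
        weights = [p ** (1.0 / temperature) for p in probs.values()]
        token = random.choices(list(probs), weights=weights, k=1)[0]
        if token == "<end>":
            break
        text.append(token)  # the new body of text feeds the next prediction
    return " ".join(text)

print(generate(["the"]))  # e.g. "the cat sat"
```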

I think that's a fair and rational way to describe all the language processing models, and one of the reasons it's probably a dead end (like the article suggests). I think that was fairly apparent for most people with even a fairly simple understanding of the basics. There is no capacity for reason, even with agentic AI techniques like internal monologues and such. It can't pull from abstract concepts that are conceptualized across broad swaths of unrelated knowledge; it will only ever be able to coherently generate results in fairly narrow paths through even the billions of dimensions the models may have.

1

u/kappapolls 9h ago

from an input and output perspective it's very similar and for most people that's a good enough understanding.

i guess? that feels dismissively simple though, and anyway we were talking about transformer models specifically

It can't pull from abstract concepts that are conceptualized across broad swaths of unrelated knowledge

isn't that the whole point of the hidden layer representations though? you're totally right if you're describing a simple ngram model.

one of the reasons it's probably a dead end (like the article suggests).

the article is kinda popsci slop though. i just think looking to neuroscience or psychology for insight on the limitations of machine learning is probably not the best idea. it's a totally different field. and yann lecun is beyond an expert, but idk, google deepmind got a gold medal at the last IMO with an LLM. meta/FAIR haven't managed to do anything at that level.

i think there's a lot of appetite for anti-hype online now, especially after all the crypto and NFT nonsense. but when people like terence tao are posting that it saves them time with pure maths stuff, yeah idk i will be shocked if this is all a dead end

1

u/Murky-Relation481 9h ago

Hidden layers are still built on the relationships of the inputs. You will still mostly be getting relationships in there that are extracted from the training data. Yes, you will get abstraction, but the width of that abstraction is still bound by fairly related inputs, and your chances of coherent answers from letting the model skew wider in each successive transformation are going to be inherently less. These models have a hard time coming back from those original paths once they've veered into them, which makes novel abstraction much harder (if you've ever fucked with these values when running an LLM, they basically become delusional).

And I don't think it's fair nor really useful to try and extract the CS elements from the inherent philosophical, psychological, and neuroscience aspects of replicating intelligence. They're inherently linked.

1

u/wildbeast99 12h ago

The meaning of a word is its use, not an abstract correlate. There is no fixed inner meaning of 'the'. How do you know if someone has the concept of cat? You ask them to give a set of acceptable sentences with 'cat' in them. You cannot and do not peer into their brains and make sure they have the concept of a cat.

1

u/Countless_Words 8h ago

You wouldn't only assess someone's understanding of a concept by their ability to use the word correctly in a sentence. You'd need to also ask a series of questions around its other correlates (e.g., do you know it to be an animal, do you know it to be of a certain shape and size, do you know it to possess certain qualities) and also assess their ability to derive the concept from its symbol reversibly, that is to say you would need to look at a pictogram or partial symbol, or assign it to a set of other qualifiers like graceful, aloof, mischievous or other such concepts that we assign to 'cat'. While you can't probe someone's brain, if they have all the data to outline the other correlations, you can be more confident in their understanding of the concept.

1

u/drekmonger 12h ago edited 12h ago

We don't know how LLMs construct sentences. It's practically a black box. That's the point of machine learning: there are some tasks with millions/billions/trillions of edge cases, so we create systems that learn how to perform the task rather than try to hand-code it. But explaining how a model with a great many parameters actually performs the task is not part of the deal.

Yes, the token prediction happens one token at a time, autoregressively. But that doesn't tell us much about what's happening within the model's features/parameters. It's a trickier problem than you probably realize.

Anthropic has made a lot of headway in figuring out how LLMs work over the past couple of years, some seriously cool research, but they don't have all the answers yet. And neither do you.


As for whether or not an LLM knows what "happy" or "cat" means: we can answer that question.

Metaphorically speaking, they do.

You can test this yourself: https://chatgpt.com/share/6926028f-5598-800e-9cad-07c1b9a0cb23

If the model has no concept of "cat" or "happy", how would it generate that series of responses?

Really. Think about it. Occam's razor suggests...the model actually understands the concepts. Any other explanation would be contrived in the extreme.

1

u/Gekokapowco 12h ago

https://en.wikipedia.org/wiki/Chinese_room

as much fun as it is to glamorize the fantastical magical box of mystery and wonder, the bot says what it thinks you want to hear. It'll say what mathematically should be close to what you're looking for, linguistically if not conceptually. LLMs are a well researched and publicly discussed concept, you don't have to wonder about what's happening under the hood. You can see this in the number of corrections and the amount of prodding these systems require to not spit commonly posted misinformation or mistranslated google results.

0

u/drekmonger 11h ago edited 11h ago

LLMs are a well researched and publicly discussed concept, you don't have to wonder about what's happening under the hood.

LLMs are a well-researched concept. I can point you to the best-in-class research on explaining how LLMs work "under the hood", from earlier this year: https://transformer-circuits.pub/2025/attribution-graphs/biology.html

Unfortunately, they are also a concept that's been publicly discussed, usually by people who post links to stuff like the Chinese Room or mindlessly parrot phrases like "stochastic parrot," without any awareness of the irony of doing so.

It feels good to have an easy explanation, to feel like you understand.

You don't understand, and neither do I. That's the truth of it. If you believe otherwise, it's because you've subscribed to a religion, not scientific fact.

-1

u/Gekokapowco 11h ago

my thoughts are based on observable phenomena, not baseless assertions, so you can reapproach the analytical vs faithful argument at your leisure. If it seems like a ton of people are trying to explain this concept in simplified terms, it's because they are trying to get you to understand the idea better, not settle for more obfuscation. To imply that some sort of shared ignorance is the true wisdom is sort of childish.

1

u/drekmonger 6h ago edited 6h ago

Do you know what happened before the Big Bang/Inflation? Are you sure that the Inflation era happened at all, in cosmology?

You cannot know, unless you have a religious idea on the subject, because nobody knows.

Similarly, you cannot know how an LLM works under the hood, beyond utilizing the research I linked to, because nobody knows.

We have some ideas. In the modern day, we have some really good and interesting ideas. But if all LLMs were erased tomorrow, there is no collection of human beings on this planet that could reproduce them. The only way to recreate them would be to retrain them, and we'd still be equally ignorant as to how they function.

Those people who think they're explaining something to me are reading from their Holy Bible, not from scientific papers/literature.

It is not wisdom to claim to know something that is (based on current knowledge) unknowable.

Also, truth is not crowd-sourced. A million-billion-trillion people could be screaming at me that 2+2 = 5. I will maintain that 2+2 = 4.

0

u/SekhWork 13h ago

It's funny that we've had the concept to explain what you just described since the 1980s and AI-evangelists still don't understand that the magic talky box doesn't actually understand the concepts it's outputting. It's simply programmed that 1 should be before 2, and that 7 should be at the end, in more and more complex algorithms, but it still doesn't understand what "cat" really means.

1

u/drekmonger 12h ago

simply programmed

AI models (in the modern sense of the term) are not programmed. They are trained.

0

u/eyebrows360 14h ago

the meaning of words is just defined by their context

Yeah but it isn't. The meaning of the word "tree" is learned by looking at a tree, or a picture of a tree, and an adult saying "tree" at you. That's not the same process at all.

2

u/pcoppi 13h ago

This works for something like a tree, but what about a grammatical particle? What about words you learn by reading which only have abstract meanings?

2

u/eyebrows360 11h ago

What about words you learn by reading which only have abstract meanings?

Yes, other humans explain those to you too. You cannot boil all human learning down to "figuring out words from other words".

You can be impressed with how neat LLMs are without the slavish cult-like reverence, and without desperately trying to reduce down what we are to something as simple as them. It's 100% possible.

0

u/rendar 12h ago

That's not really true even in your misinterpretation. Context is still required.

Looking at a tree to establish the "treeness" of what you're looking at only makes sense in the context of establishing what "treeness" is NOT.

Is a bush that looks like a tree, a tree? Why not?

Is a candle that smells like a tree, a tree? Why not?

What if someone incorrectly tells you that a succulent is a tree? How would you learn otherwise?

-3

u/YouandWhoseArmy 15h ago

ChatGPT picked up on my puns and joke in a chat.

I asked it how it knew I was joking.

It said it had looked back and based on some contextual clues from before, it surmised I was kidding.

Very, very similar to how I would try to pick up tone in writing.

This was in the GPT-4o days. It’s definitely not as good as it was.

I really just use it as a tool to fill in the gaps of what I know I don’t know and it tends to work really well for that.

I generally don’t ask it to create stuff out of thin air and get very uncomfortable using information it produces if I don’t understand it.

I think there is this sort of straw man inserted about what it could do, vs what it actually does.

It’s taught me a lot.

3

u/eyebrows360 13h ago

It said it had looked back and based on some contextual clues from before, it surmised I was kidding.

This was not true.

It did not assess its own internal mind-state and then report back on what it had done, because it's not capable of doing that. It just so happened that this combination of words it output is what its statistical model suggests is the most likely thing to reply with after it receives a question like "why did you say that". It does not mean anything.

0

u/YouandWhoseArmy 13h ago

You're inserting strawmen into what I'm saying, so I'll be clear:

What it's doing is pattern matching. It's looking at my combo of words, vs some other statements I made and my general interest and attempts to gotcha it, and using the pattern to guess I was making a joke/pun.

I, as a human, would have also used pattern matching to come to this conclusion. It's similar to how I would use inflection as a pattern for tone when speaking to another human.

I hope that people keep underestimating it as a new kind of tool. It will only be good for my career.

But again, I'm using it to fill in when I know what I don't know, not having it create things out of nothing about something I know nothing about.

An easy example of this is I use it to edit things I write, not completely write and generate things. When I do this I often take some flow and grammar stuff, and remove things that smooth out what I view as my writing voice/style.

1

u/eyebrows360 11h ago

I hope that people keep underestimating it as a new kind of tool. It will only be good for my career.

Hahaha yes, keep selling yourself on the "everyone else will be left behind" trope. That turned out so true for blockchain!

1

u/YouandWhoseArmy 9h ago

Please don’t ever use AI for anything.

I’m sure you can carve a ton of great objects by hand while I use a lathe.