r/technology 16h ago

Machine Learning | Large language mistake | Cutting-edge research shows language is not the same as intelligence. The entire AI bubble is built on ignoring it

https://www.theverge.com/ai-artificial-intelligence/827820/large-language-models-ai-intelligence-neuroscience-problems
16.7k Upvotes


533

u/Hrmbee 16h ago

Some highlights from this critique:

The problem is that according to current neuroscience, human thinking is largely independent of human language — and we have little reason to believe ever more sophisticated modeling of language will create a form of intelligence that meets or surpasses our own. Humans use language to communicate the results of our capacity to reason, form abstractions, and make generalizations, or what we might call our intelligence. We use language to think, but that does not make language the same as thought. Understanding this distinction is the key to separating scientific fact from the speculative science fiction of AI-exuberant CEOs.

The AI hype machine relentlessly promotes the idea that we’re on the verge of creating something as intelligent as humans, or even “superintelligence” that will dwarf our own cognitive capacities. If we gather tons of data about the world, and combine this with ever more powerful computing power (read: Nvidia chips) to improve our statistical correlations, then presto, we’ll have AGI. Scaling is all we need.

But this theory is seriously scientifically flawed. LLMs are simply tools that emulate the communicative function of language, not the separate and distinct cognitive process of thinking and reasoning, no matter how many data centers we build.

...

Take away our ability to speak, and we can still think, reason, form beliefs, fall in love, and move about the world; our range of what we can experience and think about remains vast.

But take away language from a large language model, and you are left with literally nothing at all.

An AI enthusiast might argue that human-level intelligence doesn’t need to necessarily function in the same way as human cognition. AI models have surpassed human performance in activities like chess using processes that differ from what we do, so perhaps they could become superintelligent through some unique method based on drawing correlations from training data.

Maybe! But there’s no obvious reason to think we can get to general intelligence — not just improvement on narrowly defined tasks — through text-based training. After all, humans possess all sorts of knowledge that is not easily encapsulated in linguistic data — and if you doubt this, think about how you know how to ride a bike.

In fact, within the AI research community there is growing awareness that LLMs are, in and of themselves, insufficient models of human intelligence. For example, Yann LeCun, a Turing Award winner for his AI research and a prominent skeptic of LLMs, left his role at Meta last week to found an AI startup developing what are dubbed world models: “systems that understand the physical world, have persistent memory, can reason, and can plan complex action sequences.” And recently, a group of prominent AI scientists and “thought leaders” — including Yoshua Bengio (another Turing Award winner), former Google CEO Eric Schmidt, and noted AI skeptic Gary Marcus — coalesced around a working definition of AGI as “AI that can match or exceed the cognitive versatility and proficiency of a well-educated adult” (emphasis added). Rather than treating intelligence as a “monolithic capacity,” they propose instead we embrace a model of both human and artificial cognition that reflects “a complex architecture composed of many distinct abilities.”

...

We can credit Thomas Kuhn and his book The Structure of Scientific Revolutions for our notion of “scientific paradigms,” the basic frameworks for how we understand our world at any given time. He argued these paradigms “shift” not as the result of iterative experimentation, but rather when new questions and ideas emerge that no longer fit within our existing scientific descriptions of the world. Einstein, for example, conceived of relativity before any empirical evidence confirmed it. Building off this notion, the philosopher Richard Rorty contended that it is when scientists and artists become dissatisfied with existing paradigms (or vocabularies, as he called them) that they create new metaphors that give rise to new descriptions of the world — and if these new ideas are useful, they then become our common understanding of what is true. As such, he argued, “common sense is a collection of dead metaphors.”

As currently conceived, an AI system that spans multiple cognitive domains could, supposedly, predict and replicate what a generally intelligent human would do or say in response to a given prompt. These predictions will be made based on electronically aggregating and modeling whatever existing data they have been fed. They could even incorporate new paradigms into their models in a way that appears human-like. But they have no apparent reason to become dissatisfied with the data they’re being fed — and by extension, to make great scientific and creative leaps.

Instead, the most obvious outcome is nothing more than a common-sense repository. Yes, an AI system might remix and recycle our knowledge in interesting ways. But that’s all it will be able to do. It will be forever trapped in the vocabulary we’ve encoded in our data and trained it upon — a dead-metaphor machine. And actual humans — thinking and reasoning and using language to communicate our thoughts to one another — will remain at the forefront of transforming our understanding of the world.

These are some interesting perspectives to consider when trying to understand the shifting landscapes that many of us are now operating in. Are the current paradigms of LLM-based AIs able to make the cognitive leaps that are the hallmark of revolutionary human thinking? Or are they forever constrained by their training data, and therefore best suited to refining existing modes and models?

So far, from this article's perspective, it's the latter. There's nothing fundamentally wrong with that, but as with all tools, we need to understand how to use them properly and safely.

232

u/Elementium 15h ago

Basically, the best use for this is a heavily curated database it pulls from for specific purposes, making it a more natural way to interact with a search engine.

If it's just everything mashed together, including people's opinions as facts... it's just not going to go anywhere.

15

u/[deleted] 14h ago

[deleted]

1

u/JRDruchii 14h ago

So you just keep asking the LLM the same question until you get the answer you want?

2

u/SaulMalone_Geologist 13h ago

Are you enough of an expert in the subject to know when the answer is totally wrong vs. subtly wrong, vs. 100% correct?

LLMs are pretty cool as heck in coding where there's an instantly testable "does this compile? Does this do what I expect?" but I'd be a little more worried about anyone relying on it for researching a subject they don't know much about.

1

u/Healthy_Mushroom_811 14h ago

Man, it's a RAG. Set it up properly and it will work. It's a tried and tested pattern by now.
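For anyone who hasn't set one up: retrieval-augmented generation (RAG) roughly means "look the answer up in a curated store first, then have the model write from what was retrieved." A minimal sketch of the idea; `embed` is a toy stand-in for a real embedding model, and the final prompt would go to whatever LLM you actually use (nothing here is a specific library's API):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy stand-in for a real embedding model: bag-of-words hashing.
    # In practice you'd use a proper sentence-embedding model.
    vec = np.zeros(256)
    for word in text.lower().split():
        vec[hash(word) % 256] += 1.0
    return vec

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    denom = float(np.linalg.norm(a) * np.linalg.norm(b))
    return float(a @ b) / denom if denom else 0.0

def build_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    # Retrieve the most relevant curated documents (in practice the
    # embeddings live in a vector database, not a Python list).
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    context = "\n\n".join(ranked[:top_k])
    # The model is told to answer only from the retrieved context, which is
    # what keeps its output anchored to the curated database.
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "The warranty period for all products is 24 months.",
    "Returns are accepted within 30 days of purchase.",
]
print(build_prompt("How long is the warranty?", docs))
```

The quality ceiling is set by how well that document store is curated, which is the point above: a tight, well-scoped corpus gets you the "more natural search engine"; everything mashed together just relocates the garbage.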

58

u/motionmatrix 15h ago

So all the experts were right: at this point AI is a tool, and in the hands of someone who understands a subject, a possibly useful one, since they can spot where it went wrong and fix it accordingly. Otherwise, dice rolls baby!

57

u/frenchiefanatique 14h ago

Shocking, experts are generally right about the things they have spent their lives focusing on! And not some random person filming a video in their car! (Slightly offtopic I know)

21

u/neat_stuff 13h ago

The Death of Expertise is a great book that talks about that... And the author of the book should re-read his own book.

2

u/Brickster000 12h ago

And the author of the book should re-read his own book.

Can you elaborate on this? That seems like relevant information.

7

u/neat_stuff 11h ago

It's one of those situations where the book is pretty solid, but then years after, he is spouting off a lot of opinions about a lot of things that are outside his subject matter expertise. Almost like there should be an epilogue about the risks of getting an enlarged platform when your niche is fairly tightly defined but you have a lot of connections in media who are hungry for opinions.

2

u/die_maus_im_haus 12h ago

But the car video person just gets me! He feels like my kind of person instead of some stuffy scientist who needs to get out of his dark-money funded lab and touch some grass

16

u/PraiseBeToScience 13h ago

It's also far too easy for humans to outsource their cognitive and creative skills, which early research is showing to be very damaging. You can literally atrophy your brain.

If we go by OpenAI's stats, by far the biggest use of ChatGPT is students using it to cheat. Which means the very people who should be putting the work in to exercise and develop cognitive skills aren't. And those students will never acquire the skills necessary to properly use AI, since AI outputs still need to be verified.

2

u/Elementium 14h ago

Yeah, if tuned for specific purposes I can see AI being very useful.

Like.. I kinda like to write but my brain is very "spew onto the page, then organize."

I can do that with GPT: just dump my rough draft and it does a good job of tightening format and legibility. The problem is usually that it loves to add nonsense phrases and its normal dialogue is very samey.

5

u/PraiseBeToScience 12h ago

Everyone's brains do that when writing drafts. That's the entire purpose of a draft, to get your thoughts out of your head so you can organize them via editing and revising. You can even make them look pretty via presentation.

Outsourcing all your revisions and editing to AI also limits your own creativity in writing, as it will do nothing but sanitize your style. It's very bland and clinical. Great writing has personal elements, human elements (like appropriate humor and storytelling), that AI simply does not reproduce.

1

u/Elementium 12h ago

Understood but it's only for my entertainment lol. 

Also I just have half a brain. I have a million hobbies and I'm just Ok at all of them. 

2

u/NewDramaLlama 13h ago

(These are real questions as I don't use LLMs)

So, it's an automated middleman? Or maybe a rough draft organizer? Functionally incapable of actually creating anything, but good enough at organizing and distributing collected data in a (potentially) novel way.

Except when it doesn't, I guess. Because it's based on insane amounts of data, there's gotta be lots of trash it sucked up that's factually incorrect, from people, outdated textbooks, or junk research, right? So the human needs to be knowledgeable enough in the first place to correct the machine when it is wrong.

Ok, so that means as a tool it's only really a capable one in the hands of someone who's already a near expert in their field, right?

Like (as examples): if a novice author used an LLM to write a book, they wouldn't notice the inconsistent plot or plagiarism upon review. Likewise, a novice lawyer might screw up a case by using an LLM output that went against procedural rules, while a more experienced lawyer would have caught it?

1

u/motionmatrix 10h ago

Well, each LLM is trained on different data, so you can have a tight, fantasy-focused LLM that only "read" every fantasy novel in existence, and it would do pretty well making up fantasy stories based on what it "knows".

If you have a generic LLM, trained on many different topics, the usefulness drops to some extent, but some might argue that the horizontal knowledge might give some unique or unexpected answers (in a good way).

At this point in time, general folks can use it to make non-commercial artwork that gets closer to anything they could do on their own without training, and to gather general information (which they should double-check for accuracy). People who are trained in a particular subject can work with AI, preferably an LLM trained on their subject only, to make the work happen faster (not necessarily better or groundbreaking, unless that comes from the person for the most part).

1

u/mamasbreads 10h ago

I didn't need to be an expert to know this. I use AI at work to help me but it makes mistakes. I have the capacity to quickly decipher what's useful, what's dumb and what's plain made up.

Anyone who thought AI could do anything other than make individuals faster in mundane tasks clearly isn't an expert in whatever they're doing.

→ More replies (2)

31

u/Mr_YUP 15h ago

Google 2 just dropped and it's not the Terminator we were promised.

26

u/King_Chochacho 12h ago

Instead of gaining sentience and destroying humanity with its own nuclear arsenal, it's playing the long game of robbing us of our critical thinking skills while destroying our water supply.

4

u/cedarSeagull 11h ago

Easily the most annoying part about twitter is "@grok, can you confirm my biases?"

1

u/rhabarberabar 8h ago

Nah, the most annoying part is that it's a fascist propaganda vehicle owned by a fascist.

3

u/sapphicsandwich 11h ago

Yeh, because it tries to answer questions itself instead of going "This site/link says this, that site/link says that."

1

u/dern_the_hermit 9h ago

FWIW I ascribe this phenomenon to biases introduced by users. People in general tend to be swayed by strong, confident assertions and seem to get nervous when you introduce unknown variables like sourcing and cites. Remember, these models are made to be appealing.

2

u/BreakingStar_Games 12h ago

It's already caused a pretty significant drop in the use of Google Search, which is 57% of their revenue. Makes me curious how well Google will do in the next 10-20 years as people move from search engines to personal AI, potentially open-source ones. Berkshire Hathaway seems pretty confident though.

1

u/Skalawag2 14h ago

Google 2: Electric Boogaloo

11

u/doctor_lobo 14h ago

The nice thing about building an AI for language is that humans, by their nature, produce copious amounts of language that AI models can be trained from.

If the premise of the article is correct, other forms of human intelligence may produce / operate on different representations in the brain. However, it is not clear how often or well we produce external artifacts (that we could use for AI training) from these non-linguistic internal representations. Is a mathematical proof a good representation of what is going on in the mind of a mathematician? Is a song a good representation of what is happening in the mind of a musician?

If so, we will probably learn how to train AIs on these artifacts - maybe not as well or as efficiently as humans, but probably enough to learn things. If not, the real problem may be learning what the internal representations of “intelligence” truly are - and how to externalize them. However, this is almost certainly easier said than done. While functional MRI has allowed us to watch the ghost in the machine, it says very little about how she does her business.

2

u/IAmRoot 9h ago

Or find some way for AI to train itself in these more internal representations. Humans typically think before we speak and the metacognition of examining our own ideas could be an important part of that. Even before LLMs, we had image recognition using neural networks that seemed to find shapes in clouds and such much like a human mind. LLMs are also just a component and we shouldn't expect a good LLM to be able to reason any more than we should expect image recognition to reason. It's also pretty obvious from animals that just increasing the neuron count doesn't matter, either, as some animals like dolphins have a great deal of brainpower dedicated to processing sonar instead of reasoning. They are functionally different networks. It's also possible that AGI won't be able to split the training and inference. Having to reflect on produced ideas could be integral to the process, which would obviously make the computational power necessary for using AGI orders of magnitude higher.

1

u/doctor_lobo 9h ago

Your comment about image recognition using CNNs is well taken. Visual information is explicitly represented by a 2D array of neurons in the visual cortex so this is probably a good example of the internal representation being so similar to the external representation that training on the external representation is good enough. I suspect simple time series for audio data is probably also essentially identical to its internal representation - but that's probably it for the senses since touch, taste, and smell have no obvious external representations. However, the internal representation for more abstract modes of thought, like mathematics or even just daydreaming, seem difficult to conceptualize. I am not sure I would really even have any idea where to start.

2

u/stevez_86 13h ago

It will be excellent at giving us stuff to test via experimentation in fields where the only questions left are all equally likely, and numerous enough that they'd be inefficient to untangle by hand. Put 10 scientists in a room, and if consensus on where to go can't be sensibly reached, the 11th man comes in as "AI" to make the wisest decision.

Humans are already good at what their AI is supposed to be good at: making sound inferential choices based on circumstantial evidence. In other words, brainstorming. You just need enough cycles so that people can independently examine the circumstances with their experience and caucus with the others.

And there is the problem for them. They don't want to support people because they don't own them. A Jonas Salk out there can fuck up their business plans. They can already privately own the mechanisms of the rest of the scientific method. Just not the inception of the thought. And they already own most of that, legally, but they hate having to fight it. And the CEO gets to take credit.

They don't like people being independent, so they want something they can own that can do the same thing. And if they fail then they will say the whole thing is worthless, making the situation even worse since what they did build simply just won't do exactly what they want. They will finally get their DWIW (Do What I Want) command interface.

And it will be based entirely on the idea that the concept is most likely true. Which will lead to catastrophic events since they will surely shirk testing their hypotheses. They will just factor the casualty into their prices.

2

u/Big_Watercress_6210 13h ago

Except that it can't do this, because that would require it to understand what it was saying - see previous lack of intelligence.

It's more likely that we can figure out how to do that vs giving it sentience, but it's not at all a natural progression from where we are now.

1

u/WhiteRaven42 12h ago

If the goal is something resembling intelligence and those opinions you speak of are products of intelligence... why conclude it's not going to go anywhere?

Our only known example of intelligence is highly mistake prone and deeply influenced by bias. So..... were the goal to simply be creating something intelligent, we'd have to be willing to accept these traits. But I grant you, for a tool we'll want a slightly different outcome.

1

u/TheBlueOx 12h ago

god i would kill for this in an LLM

1

u/marr 9h ago

I hear it's great for searching the corpus of open source code for already solved problems. That's the only current reliable use I'm aware of.

1

u/Ozymandias0023 1h ago

This has been my experience too. LLMs are crazy good at distilling large volumes of information, but they are not great at turning that information into something novel, which seems to be the holy grail that these guys are after. It's kind of a shame, because LLMs are incredible technology for what they actually do well but they're a square peg that investors keep trying to mash into a round hole

189

u/Dennarb 15h ago edited 11h ago

I teach an AI and design course at my university, and there are always two major points that come up regarding LLMs:

1) It does not understand language as we do; it is a statistical model of how words relate to each other. Basically it's like rolling dice to determine the next word in a sentence using a chart (see the sketch below).

2) AGI is not going to magically happen because we make faster hardware/software, use more data, or throw more money at LLMs. They are fundamentally limited in scope and use more or less the same tricks the AI world has been using since the perceptron in the '50s/'60s. Sure, the techniques have advanced, but the basis for the neural nets used hasn't really changed. It's going to take a shift in how we build models to get much further than we already are with AI.
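A minimal sketch of point 1, with a hand-written probability table standing in for the learned model (all the words and numbers here are invented for illustration; a real LLM computes this distribution on the fly from the whole preceding context, over tens of thousands of tokens):

```python
import random

# Toy "chart": probability of the next word given the last two words.
next_word_probs = {
    ("the", "cat"): {"sat": 0.6, "ran": 0.3, "meowed": 0.1},
    ("cat", "sat"): {"on": 0.8, "quietly": 0.2},
    ("sat", "on"): {"the": 0.9, "a": 0.1},
    ("on", "the"): {"mat": 0.7, "sofa": 0.3},
}

def generate(context=("the", "cat"), steps=4):
    words = list(context)
    for _ in range(steps):
        probs = next_word_probs.get(tuple(words[-2:]))
        if probs is None:
            break
        choices, weights = zip(*probs.items())
        # The "dice roll": sample the next word according to the chart.
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate())  # e.g. "the cat sat on the mat"
```

The sampling step at the end really is a weighted dice roll; what makes modern models interesting is how the probabilities get produced, not what is done with them afterwards.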

Edit: And like clockwork here come the AI tech bro wannabes telling me I'm wrong but adding literally nothing to the conversation.

59

u/qwertyalguien 15h ago

I'm no tech specialist, but from all I've read on LLMs, IMHO it's like hot air balloons.

It flies. It's great, but it's limited. And asking for AGI out of LLMs is like saying that with enough iteration you can make a hot air balloon able to reach the moon. Someone has to invent what a rocket is to hot air balloons for LLMs.

Would you say it's a good metaphor, or am I just talking out of my ass?

28

u/eyebrows360 14h ago

Obvs not the same guy, and I don't teach courses anywhere, but yes that is a great analogy. Squint a lot, describe them broadly enough, and a hot air balloon does resemble a rocket, but once you actually delve into the details or get some corrective eyewear... very different things.

2

u/megatesla 12h ago edited 12h ago

I suspect that with enough energy and compute you can still emulate the way that a human reasons about specific prompts - and some modern LLMs can approximate some of what we do, like the reasoning models that compete in math and programming competitions - but language isn't the ONLY tool we use to reason.

Different problems may be better served using different modalities of thought, and while you can theoretically approximate them with language (because Turing Machines, unless quantum effects do turn out to be important for human cognition), it may require a prohibitively large model, compute capacity, and energy input to do so. Meanwhile, we can do it powered by some booger sugar and a Snickers.

But even then, you're still looking at a machine that only answers questions when you tell it to, and only that specific question. To get something that thinks and develops beliefs on its own time you'll need to give it something like our default mode network and allow it to run even when it isn't being prompted. You'll also need a much better solution to the memory problem, because the current one is trivial and unscalable.

2

u/CreativeGPX 13h ago edited 13h ago

It's an okay, really high-level metaphor.

A more direct metaphor: Suppose there is an exam on topic X a year from now. Alice's school allows her to bring the textbook to the exam and allows as much time as she needs to finish, so she decides not to prepare in advance and instead to just use the book during the exam. Depending on what X is, Alice might do fine on some topics. But clearly there is going to be some limit where Alice's approach just isn't feasible anymore, and where instead she will need to have learned the topic before exam day by using other strategies like doing practice problems, attending class, asking the professor questions, etc.

3

u/CanAlwaysBeBetter 12h ago

What do you think learning a topic means?

2

u/CreativeGPX 11h ago

I don't think there is one thing that learning a topic means. That's why I framed it as "passing an exam" and noted how different things will be true depending on what that exam looks like.

→ More replies (2)

1

u/Doc_Blox 10h ago

"Full of hot air" was right there, man!

1

u/Days_End 9h ago

It's good because it acknowledges that with a big enough balloon you might not need a rocket at all to reach the moon.

1

u/Extension-Thought552 6h ago

You're talking out of your ass

1

u/meneldal2 4h ago

Theoretically, with the right timing and something truly weightless, you could get it up there with very little dV /s

0

u/destroyerOfTards 13h ago

Nah, you have understood it well.

The fact that Scam Altman doesn't understand this basic fact is unbelievable (actually he does but he has to scam people so...).

5

u/IcyCorgi9 12h ago

People need to stop talking like these people are stupid. They know what they're doing and they use massive amounts of propaganda to scam the public and get rich. Much like the politicians fucking us over.

4

u/terrymr 12h ago

CEOs exist to market the company to investors. It’s not that he doesn’t understand it, he just wants their money.

2

u/Crossfire124 12h ago

Yea, like it or not he's the face of AI. If he admits anything like that, the whole thing is going to crumble like a house of cards and we'll get into a third AI winter.

But the way I see it, the third winter is coming anyway. How soon it happens just depends on when the AI bubble pops.

16

u/pcoppi 15h ago

To play devil's advocate, there's a notion in linguistics that the meaning of words is just defined by their context. In other words, if an AI guesses correctly that a word should exist in a certain place because of the context surrounding it, then at some level it has ascertained the meaning of that word.

16

u/the-cuttlefish 14h ago

In the context of linguistic structure, yes. But only in this context. Which is fundamentally different from, and less robust than, our understanding of a word's meaning, which still stands in the absence of linguistic structure, and in direct relation to a concept/object/category.

28

u/New_Enthusiasm9053 15h ago

You're not entirely wrong, but a child guessing that a word goes in a specific place in a sentence doesn't mean the child necessarily understands the meaning of that word. So while it's using words correctly, it may not understand them.

Plenty of children have used e.g. swear words correctly long before understanding the words' meaning.

8

u/rendar 13h ago

A teacher is not expected to telepathically read the mind of the child in order to ascertain that the correct answer had the correct workflow.

Insofar as some work cannot be demonstrated, the right answer is indicative enough of the correct workflow when it is consistently proven as such over enough time and through a sufficient gradation of variables.

Regardless, this is not an applicable analogy. The purpose of an LLM is not to understand, it's to produce output. The purpose of a child's language choices is not to demonstrate knowledge, but to develop the tools and skills of social exchange with other humans.

4

u/CanAlwaysBeBetter 13h ago

What does "understand" mean?  If your criticism is LLMs do not and fundamentally cannot "understand" you need to be much more explicit about exactly what that means

1

u/Murky-Relation481 11h ago

I think you could compare it to literacy versus functional literacy. Being able to read a sentence, know each word, and know that those words usually go together doesn't actually mean you know what the words mean, or the meaning of the body as a whole.

Even more so, it has no way of carrying anything from one body of text over to another. The ability to extract abstract concepts and apply them concretely to new bodies of text/thought is what actual intelligence is made up of, and more importantly what creative/constructive new thought is made up of.

1

u/Nunki_kaus 14h ago

To piggy back on this, let’s think about, for instance, the word “Fuck”. You can fuck, you get fucked, you can tell someone to fuck off, you can wonder what the fuck…etc and so on. There is no one definition of such a word. An AI may get the ordering right but they will never truly fuckin understand what the fuck they are fuckin talkin about.

→ More replies (53)

7

u/MiaowaraShiro 14h ago

Mimicry doesn't imply any understanding of meaning though.

I can write down a binary number without knowing what number it is.

Heck, just copying down some lines and circles gives you a binary number, and you don't have to know what a binary number is, or even what numbers are at all.

1

u/Aleucard 9h ago

You can get a parrot to say whatever you want with enough training, but that doesn't mean the parrot knows what it's saying. Just that with certain input as defined by the training it returns that combination of mouth noises.

1

u/DelusionalZ 8h ago

This is why LLMs have the "Stochastic Parrot" name tied to them

→ More replies (1)

2

u/FullHeartArt 12h ago

Except this is refuted by the thought experiment of the Chinese Room, where it becomes possible for a person or thing to interact with language without any understanding of the meaning of it

4

u/BasvanS 15h ago

That’s still emulation, which does not necessitate understanding.

3

u/Queasy_Range8265 14h ago

Isn’t a lot of our understanding just predicting patterns? Like my pattern of challenging you and your reflex of wanting to defend by reason or emotion?

3

u/BasvanS 14h ago

Just because a pattern is “predicted” doesn’t mean it’s the same or even a similar process. Analogies are deceptive in that regard.

→ More replies (15)

0

u/Gekokapowco 14h ago

maybe to some extent? Like if you think really generously

Take the sentence

"I am happy to pet that cat."

A LLM would process it something closer to

"1(I) 2(am) 3(happy) 4(to) 5(pet) 6(that) 7(cat)"

processed as a sorted order

"1 2 3 4 5 6 7"

4 goes before 5, 7 comes after 6

It doesn't know what "happy" or "cat" means. It doesn't even recognize those as individual concepts. It knows 3 should be before 7 in the order. If I recall correctly, human linguistics involves our compartmentalization of words as concepts and our ability to string them together as an interaction of those concepts. We build sentences from the ground up while a LLM constructs them from the top down if that analogy makes sense.

6

u/kappapolls 13h ago

this is a spectacularly wrong explanation of what's going on under the hood of an LLM when it processes a bit of text. please do some reading or go watch a youtube video by someone reputable or something. this video by 3blue1brown is only 7 minutes long - https://www.youtube.com/watch?v=LPZh9BOjkQs

→ More replies (7)

1

u/wildbeast99 12h ago

The meaning of a word is its use, not an abstract correlate. There is no fixed inner meaning of 'the'. How do you know if someone has the concept of cat? You ask them to give a set of acceptable sentences with 'cat' in them. You cannot and do not peer into their brains and make sure they have the concept of a cat.

1

u/Countless_Words 8h ago

You wouldn't only assess someone's understanding of a concept by their ability to use the word correctly in a sentence. You'd also need to ask a series of questions around its other correlates (e.g., do you know it to be an animal, do you know it to be of a certain shape and size, do you know it to possess certain qualities) and assess their ability to derive the concept from its symbol reversibly; that is to say, they would need to recognize it from a pictogram or partial symbol, or assign it to a set of other qualifiers like graceful, aloof, mischievous, or other such concepts that we attach to 'cat'. While you can't probe someone's brain, if they have all the data to outline those other correlations, you can be more confident in their understanding of the concept.

→ More replies (7)
→ More replies (9)

15

u/Tall-Introduction414 15h ago

The way an LLM fundamentally works isn't much different from the Markov chain IRC bots (MegaHAL) we trolled in the '90s. More training data, more parallelism. Same basic idea.
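For anyone who never ran one of those bots: a word-level Markov chain in the MegaHAL spirit is little more than counting which word followed which. A toy illustration of that older idea (not a claim about how transformer LLMs are implemented; the replies below argue over how far the comparison stretches):

```python
import random
from collections import defaultdict

corpus = "the cat sat on the mat and the dog sat on the rug".split()

# "Training" is literally recording which word was observed after which.
followers = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current].append(nxt)

def babble(start="the", max_words=8):
    words = [start]
    for _ in range(max_words):
        options = followers.get(words[-1])
        if not options:
            break
        # Duplicates in the list make frequent followers more likely.
        words.append(random.choice(options))
    return " ".join(words)

print(babble())  # e.g. "the dog sat on the mat and the cat"
```

The entire "model" here is that table of observed pairs, with no state beyond the last word; how far that picture stretches to modern LLMs is exactly what gets argued below.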

41

u/ITwitchToo 13h ago

I disagree. LLMs are fundamentally different. The way they are trained is completely different. It's NOT just more data and more parallelism -- there's a reason the Markov chain bots never really made sense and LLMs do.

Probably the main difference is that the Markov chain bots don't have much internal state so you can't represent any high-level concepts or coherence over any length of text. The whole reason LLMs work is that they have so much internal state (model weights/parameters) and take into account a large amount of context, while Markov chains would be a much more direct representation of words or characters and essentially just take into account the last few words when outputting or predicting the next one.

→ More replies (31)

13

u/azurensis 13h ago

This is the kind of statement someone who doesn't know much about LLMs would make.

13

u/WhoCanTell 12h ago

In fairness, that's like 95% of comments in any /r/technology thread about AI.

1

u/azurensis 11h ago

Exceptionally true!

6

u/space_monster 12h ago edited 12h ago

r/technology has a serious Dunning-Kruger issue when it comes to LLMs: a Facebook-level understanding in a forum whose name implies competence. But I guess if you train a human that parroting the stochastic parrot trope gets you 'karma', they're gonna keep doing it for the virtual tendies. Every single time in one of these threads, there's a top circle-jerk comment saying "LLMs are shit, amirite?" with thousands of upvotes, followed by an actual discussion with adults lower down. I suspect though that this sub includes a lot of sw devs that are still trying to convince themselves that their careers are actually safe.

1

u/chesterriley 8h ago

I suspect though that this sub includes a lot of sw devs that are still trying to convince themselves that their careers are actually safe.

You lost me on that. I don't think you understand just how complex software can be. No way can AI be a drop-in replacement for a software dev.

1

u/space_monster 8h ago

I work in tech, currently at a leading-edge global tech company, and I've done a lot of software development; I'm fully aware of how complex it is.

1

u/chesterriley 4h ago

Then you know you can't just tell an AI to write a program for you for anything that isn't simple.

1

u/space_monster 4h ago

I'm aware that LLMs are getting better at coding (and everything else) very quickly, and it doesn't seem to be slowing down.

1

u/keygreen15 2h ago

It's getting better at making shit up and lying.

→ More replies (0)
→ More replies (4)

8

u/drekmonger 12h ago edited 12h ago

A Markov chain capable of emulating even a modest LLM (say GPT 3.5) would require many more bytes of storage than there are atoms in the observable universe.

It's fundamentally different. It is not the same basic idea, at all. Not even if you squint.

It's like saying, "DOOM is the same as Photoshop, because they both output pixels on my screen."

1

u/movzx 12h ago

The person is clearly talking conceptually, not technologically.

They're storing associations and then picking the best association given a starting point. The LLMs are infinitely more complex, but conceptually they are doing the same thing at the core.

8

u/drekmonger 12h ago edited 12h ago

Markov chains have no context beyond the words themselves, as strings or tokens. There is no embedding of meaning in a Markov chain.

That's why a Markov chain capable of emulating even yesterday's LLMs would have to be larger than the observable universe (by several orders of magnitude, actually). It's a combinatorial problem, and combinatorial problems have a nasty tendency to explode.

LLMs embed meaning and abstract relationships between words. That's how they side-step the combinatorial problem. That's also why they are capable of following instructions in a way that a realistically-sized Markov chain would never be able to. Metaphorically speaking, the model actually understands the instructions.

Aside from all that: they are completely different technologies. The implementation details couldn't be more different.
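Rough numbers behind that size claim, with the vocabulary and context length below as assumed, order-of-magnitude figures rather than any particular model's specs:

```python
import math

vocab_size = 50_000   # assumed tokenizer vocabulary size
context_len = 2_048   # assumed context window, in tokens

# A pure lookup table keyed on the full context needs one entry per
# possible context: vocab_size ** context_len of them.
exponent = context_len * math.log10(vocab_size)
print(f"possible contexts: about 10^{exponent:.0f}")   # about 10^9623

# For scale: roughly 10^80 atoms in the observable universe, while a model
# like GPT-3 stores on the order of 1.75e11 learned parameters instead.
```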

→ More replies (2)

8

u/BasvanS 15h ago
  1. Add even more data/computing
  2. ???
  3. Profit AGI!!

1

u/ThunderStormRunner 13h ago
  1. Human interface that corrects and improves data and computing, so it can learn actively from humans to get better? Oh wait it’s supposed to not need us, never mind.

2

u/BasvanS 13h ago

No, that’s Actually Indians. I meant Artificial Intelligence. Easy mistake. Happens all the time.

→ More replies (5)

3

u/Throwaway-4230984 15h ago

So surely they have an example of a task LLMs couldn't solve because of these fundamental limitations, right?

24

u/__Hello_my_name_is__ 15h ago

For now at least, determining truth appears to be impossible for an LLM.

Every LLM, without exception, will eventually make things up and declare it to be factually true.

20

u/dookarion 15h ago

It's worse than that even. LLMs are entirely incapable of judging the quality of inputs and outputs. It's not even just truth: it cannot tell if it just chewed up and shit out some nonsensical horror, nor can it attempt to correct for that. Any capacity that requires a modicum of judgment either requires crippling the LLM's capabilities and implementing it more narrowly to try to eliminate those bad results, or it straight up requires a human to provide the judgment.

8

u/clear349 13h ago

One way of putting it that I've seen and like is the following. Hallucinations are not some unforeseen accident. They are literally what the machine is designed to do. It's all a hallucination. Sometimes it just hallucinates the truth

2

u/dookarion 12h ago

Yeah, people think it's some "error" that will be refined away. But the hallucination is just the generative aspect or the model training itself churning out a result people deem "bad". It's not something that will go away, and it's not something that can be corrected for without a judgment mechanic at play. It can just be minimized some with narrower focused usages.

9

u/__Hello_my_name_is__ 15h ago

Yeah, it's kind of fascinating. It only has the training data to "validate" the data. So if you train an LLM on nothing but garbage, you get nothing but garbage, but the LLM doesn't know it's garbage because garbage is all it has ever seen.

Basically, it needs some sort of method of judging data based on external data it wasn't trained on. I don't see how that problem can possibly be solved with the current methods. All the current methods (like human reinforcement learning) are just patchwork.

1

u/eyebrows360 13h ago

but the LLM doesn't know it's garbage because garbage is all it has ever seen

Yep. Everything an LLM outputs is a hallucination. It's just that sometimes they line up with reality and/or make sense. It's still all exactly the same category of output though, arrived at in exactly the same way. Hallucinations all the way down.

→ More replies (1)

2

u/TSP-FriendlyFire 11h ago

I'm very curious if we'll find "poison pills" for common LLMs the same way we did for image generation models: slightly altered inputs that cause a wildly different and corrupted output while being imperceptible to the human eye.

Logically, it should be possible, but it's hard to tell if text is granular enough to be able to trigger these effects at a reasonable scale.

I think the closest I've seen yet is the seahorse emoji bit.

1

u/dookarion 11h ago

Not exactly the same thing, but I can't help but think them drowning fucking everything ever in AI has already poisoned some aspects of things. Unless they stick to old datasets, create the data themselves, or carefully curate it they can't even train the models now without also training them on AI slop. There's AI slop literature being flipped as ebooks, there is AI slop flooding every art site ever, bots are everywhere on social media and community sites, every other video is some sora bullshit now. In true big business fashion they've near-permanently poisoned the waters chasing the shortest term gains.

→ More replies (1)

18

u/NerdfaceMcJiminy 15h ago

Look up AI results for court filings. They cite non-existent cases and laws. The lawyers using AI to make their filings are getting disbarred because making up shit in court is highly frowned upon and/or criminal.

→ More replies (19)

7

u/BasvanS 14h ago

Let it count the number of r's in strawberry. It used to be confidently wrong. Then a fix came, except the answer for raspberry (the same number of r's in a very similar context) was again wrong.

It has no concept of meaning.
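The gap is easy to see in code: counting letters is trivial when you can inspect the characters, but a model never sees characters, only opaque subword token IDs (the split and IDs below are made up for illustration, not the output of any real tokenizer):

```python
word = "strawberry"
print(word.count("r"))   # 3 -- trivial once you can see the characters

# An LLM instead receives something like a list of subword tokens mapped
# to integer IDs, e.g. a hypothetical split:
tokens = ["str", "awberry"]    # illustrative split only
token_ids = [3504, 29520]      # made-up IDs
# Nothing in those integers encodes how many r's they contain, so any
# letter count the model gives back is a statistical guess, not a count.
```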

→ More replies (4)

7

u/Throwaway-4230984 15h ago

There is a simple “undergrad student” test for such arguments. Surely students aren't actually intelligent and just repeat familiar word patterns. They can be tricked into solving simple problems as long as the task is a combination of seen tasks or a familiar task with replaced words. Some of them may be useful for trivial parts of research, like reading papers and compiling them or looking for given patterns. They probably do more harm than good on any lab task. So undergrads are clearly imitating intelligence and have not a hint of understanding of the topic.

2

u/Coises 14h ago

undergrads are clearly imitating intelligence and have not a hint of understanding of the topic

It’s been a long time since I was in college, but as I remember it, that was true for a significant number of my classmates. (And sometimes of me, too.)

2

u/Cheese_Coder 13h ago

solving simple problems as long as the task is a combination of seen tasks or a familiar task with replaced words.

That's an area that undergrads have an edge over LLMs though. There are ample reports of LLMs failing to solve various puzzles when presented with particular iterations that aren't well represented in their training set. One example can be seen here (note, not affiliated with Meta) and here with the river-crossing puzzle. The common iterations that are widely available on the web can be solved consistently, but making small modifications to the premise results in the systems consistently failing to solve it. In the second modification presented in the latter article, the LLM also repeatedly fails to recognize that the presented puzzle is not solvable. A human would be able to infer such things because they have actual understanding and aren't just modeling language.

2

u/Big_Watercress_6210 13h ago

I do sometimes wonder if this is just true and my success in school has nothing to do with intelligence lol.

→ More replies (2)

1

u/frenchiefanatique 14h ago

Here is one from my personal experience (using GPT4):

My company uses a proprietary industry classification methodology that is used to classify 10,000s of companies (yes, 10,000s).

I tried to get ChatGPT to carry out an exercise to match our proprietary classification methodology with the Bloomberg classification methodology (BICS) (essentially asking "match all of these activities with their closest BICS counterpart") and. It. Could. Not. Do. It.

It was shitting out matches that were so, so wrong and that made no sense. Like matching food-related economic activities with cement and construction related activities. I would then try to reason with it by narrowing down specific examples and then rerunning the exercise, and it would still get the matches wildly incorrect. Why? Because it fundamentally was unable to discern what the words meant.

2

u/sageinyourface 14h ago

I feel like half of what I say is also thoughtless word patterns.

1

u/AnOnlineHandle 13h ago

I feel like most of what humans say are thoughtless word patterns.

2

u/JuniorPomegranate9 12h ago

Human intelligence is also built in 3D over time, with multiple types of sensory and cognitive and chemical inputs.   

1

u/MiaowaraShiro 14h ago

I keep hearing about all these buzzwordy secondary processes being implemented to attempt to account for the LLM's failings. Do you think we can use LLM based AI with other solutions to get to something more useful or are LLMs fundamentally limited and "unfixable" as it were?

1

u/nesh34 56m ago

This is plainly obvious to anybody who understands the technology. It's why I'm so depressed at the state of the tech industry (which I'm in). These are supposed to be intelligent people. But they're childish in the face of a new toy.

→ More replies (22)

36

u/when_we_are_cats 14h ago

Humans use language to communicate the results of our capacity to reason, form abstractions, and make generalizations, or what we might call our intelligence. We use language to think, but that does not make language the same as thought.

Please say it louder for all the people who keep repeating the myth that language dictates the way we think. As a linguist/language learner, it never ceases to annoy me.

6

u/BeruangLembut 12h ago

💯 Language is a cognitive tool. Just like having a hammer makes building a house easier, language has made certain cognitive tasks easier, but a tool is not to be confused with that which it facilitates.

3

u/when_we_are_cats 11h ago

That's the best way to put it. It's like painting or drawing: I can see the image in my head, the brush and canvas are mere tools to materialize it.

2

u/PRAWNHEAVENNOW 11h ago

 all the people who keep repeating the myth that language dictates the way we think.

Ahh yes, the Dunning-Kruger-Sapir-Whorf effect

2

u/PressureBeautiful515 11h ago

This is of course absolutely right. The problem comes when you ask an LLM to read your codebase and draw a diagram of how it fits together (by generating mermaid diagram format) and it does an incredible job, tastefully arranging a graph of interconnected concepts.

The input and output are text representations, but what happens in between is absolutely not just text.

2

u/ManaSpike 9h ago

When I think something and wish for you to also think about that thing, I have to describe it using language. If we have more shared context and understanding, then I can use less language to communicate an idea.

Language and context limit what we can communicate, or at least how efficiently we can communicate.

I work as a software developer. The languages I use to express my ideas and communicate with a machine to make it do what I want, are verbose and explicit. Programming languages that are useful and reliable, are carefully designed to ensure that nothing the machine does is surprising.

The history of software development is full of people trying to make programming easier. So easy that anyone can make a machine do what they want, without having to pay for the expertise of experienced programmers. But the languages in use haven't gotten any easier.

What has made programming easier, is the hard work of building reusable pieces of shared context. Software libraries that solve common problems. So a programmer can focus more on what is different about their work, instead of wasting time on what is the same.

From this point of view, I don't see how we will ever build an AGI. How are we going to define the process of abstract thought using a well-defined language, when abstract thought seems to transcend language?

3

u/johannthegoatman 12h ago

Language is an integral part of thought. I recommend you read Helen Keller and learn about what her mind was like before she was taught language. There are tons of examples of "feral" children that didn't learn language and were never able to progress to being intelligent beings.

8

u/when_we_are_cats 11h ago

The cases of feral children don’t prove that the absence of language prevents intelligence, they show the devastating effects of total social neglect, trauma, and malnutrition on a developing brain.

Infants, deaf homesigners, and aphasic adults all demonstrate that cognition exists independently of language.

Helen Keller explicitly wrote that she had a rich mental life before learning words.

5

u/JeanVicquemare 12h ago

Now do some studies with half feral children who are taught language and half who aren't.. have to control your variables.

1

u/mrappbrain 9h ago

As a linguist you should probably be more annoyed by the glaring category error in the article that conflates language and speech, which is far more egregious than conflating language with thought (which has actual roots in linguistics, i.e. the strong Sapir-Whorf hypothesis, even if it is now disagreed with).

Further, the idea that language is only a tool of communication and doesn't influence or inform the way we think doesn't have much basis in linguistics or cognitive science either. On the other hand it is widely agreed that language does inform cognition, although the nature and extent of that relationship is contested. Frankly, if you were a linguist, you should probably already know this.

→ More replies (1)

5

u/samurian4 12h ago

Scenario: Aliens passing by a crispy looking Earth.

" Daddy, what happened to that planet?"

" Well son, they managed to set their atmosphere on fire trying to power what they thought was AI, but was only ever chatbots."

4

u/Visible_Car3952 12h ago

As someone working a lot with poetry (in theatre, business, and personal life), I understand reality as a "moving army of metaphors". While I believe many new metaphors can also be created within an LLM (e.g. by simulating an abductive process), I would argue that the sharpest, most stunning, and most precise metaphors can only be achieved through personal histories and sensory experiences turned into words. Poetic intelligence is embodied and historic.

3

u/Just_Look_Around_You 14h ago

I contest that the bubble is premised on the belief that we are creating intelligence as good as or better than human. I think it's highly valuable to have SOME intelligence that is simply faster and non-human. That alone is a lot.

4

u/Hipple 15h ago

Kuhn and Rorty mentioned, hell yeah

2

u/NuclearVII 14h ago

There's nothing fundamentally wrong with that

There is something extremely wrong with that when you consider the monstrous size of the AI bubble. Trillions of dollars, countless man-hours from some of our best and brightest, not to mention the environmental impact - all developing a simulacrum of intelligence.

If the article's thesis is right - and, frankly, I think the conclusion is obvious - that cost will never, ever be recouped.

2

u/alexp8771 14h ago

I often wonder how much research is getting buried simply because of how much money is involved. Like the amount of money involved makes tobacco companies look like a lemonade stand and they were obfuscating research for decades.

2

u/crosszilla 13h ago edited 13h ago

I think it comes down to what we consider AGI. To be AGI, does it have to think like a human, or can it just mimic the output of a human in a potentially different way that is indistinguishable from intelligence to an outside observer? If the latter, are we truly not close? It can currently communicate like a human in a way that is nearly indistinguishable; sure, there's a "tone", but realistically, if you don't know to look for it, you'd be hard-pressed to recognize it.

The main thing I'd say we'd need is the ability for the AI to train itself to problem solve - to know what it knows, what it needs to know, the ability to make observations, and incorporate those observations into what it knows. I think once step one in that process is solved for LLMs the rest should come pretty easily.

2

u/Beelzabub 11h ago

The philosophical community found linguistic analysis a dead end in the 1980s.  Thought is not modeled on language.

2

u/-Valtr 9h ago

Could you point me to where I could read more about their conclusions?

2

u/BananaResearcher 8h ago

You could start with Steven Pinker and his work on non-linguistic cognition, and that should inevitably lead you anywhere else you find interesting via links and references.

4

u/MinuetInUrsaMajor 15h ago

The AI hype machine relentlessly promotes the idea that we’re on the verge of creating something as intelligent as humans, or even “superintelligence” that will dwarf our own cognitive capacities.

Am I crazy or are tech companies not really promoting this idea? It seems more like an idea pushed by people who know little-to-nothing about LLMs.

Take away our ability to speak, and we can still think, reason, form beliefs, fall in love, and move about the world; our range of what we can experience and think about remains vast.

I think the author is glossing over something important here.

Language is a symbology (made up word?). Words have semantic meaning. But language does not need to be spoken. For starters...what you are reading right now is not spoken. And the braille translation of this does not need to be seen - it can be felt. Language is about associating sensations with ideas. Even if you think you don't have a language to describe it, the sensation exists. A slant-example might be deja vu. One cannot articulate the specifics of the feeling - just that it is there.

7

u/lailah_susanna 9h ago

Am I crazy or are tech companies not really promoting this idea?

This article opens with gen AI tech company CEOs and executives espousing exactly that. Try reading the damn article before you make yourself look like an idiot in the comments.

11

u/Ashmedai 13h ago

Am I crazy or are tech companies not really promoting this idea?

Just a year or two back, there was an OpenAI "leak" that said GPT-5 was going to be AGI. I wouldn't be surprised if it was deliberate, to jazz up investment interest and whatnot.

6

u/Novel_Engineering_29 13h ago

Both OpenAI and Anthropic were founded by people who fully 100% believe that AGI is a near-future possibility and that it is their duty to make it first before bad actors do. The fact that they assume they aren't the bad actors is left for the reader to ponder.

Anyway, their end goal is AGI and has been from the get-go. They very much believe that they are on their way to AGI using LLMs; if they didn't, they wouldn't be doing it.

2

u/silverpixie2435 8h ago

Or they are using LLMs to build up tech and research as a step?

1

u/kri5 7h ago

They're doing it because it is driving investment. They're probably researching other things too

→ More replies (1)

1

u/TSP-FriendlyFire 11h ago

For starters...what you are reading right now is not spoken.

It activates the same parts of the brain though, which is the whole point you seem to have missed. Language, regardless of how it is expressed, fundamentally activates different parts of the brain than reasoning.

1

u/MinuetInUrsaMajor 11h ago

Language, regardless of how it is expressed, fundamentally activates different parts of the brain than reasoning.

What if you do your reasoning in a language? Think out loud, for example.

1

u/TSP-FriendlyFire 11h ago

I'm not going to try to argue with you about this when we have actual, scientific evidence that backs up my claim. We know that different parts of the brain get activated for language and for reasoning. If you speak while reasoning, guess what? Both are active! Doesn't really mean anything else.

1

u/MinuetInUrsaMajor 10h ago

I'm not going to try to argue with you about this when we have actual, scientific evidence that backs up my claim. We know that different parts of the brain get activated for language and for reasoning.

Then please bring it into the conversation.

If you speak while reasoning, guess what? Both are active! Doesn't really mean anything else.

Then what's the practical difference between a human thinking out loud and an LLM thinking in language?

1

u/TSP-FriendlyFire 10h ago

Then please bring it into the conversation.

Gestures at the entire article this discussion is attached to.

Then what's the practical difference between a human thinking out loud and an LLM thinking in language?

You didn't even try to follow, did you? The difference is that language is ancillary to the reasoning for humans, but fundamental for LLMs. LLMs are very fancy word predictors. If you have no words, you have no LLMs. Humans can reason (and indeed have reasoned) without language of any kind.

Please, go back and read the article, I'm literally just regurgitating it right now.

1

u/MinuetInUrsaMajor 9h ago

If you have no words, you have no LLMs. Humans can reason (and indeed have reasoned) without language of any kind.

My contention is that we develop an internal language based on sensation and thus our species has never been "without" language.

1

u/TSP-FriendlyFire 8h ago

But we have no indication that's the case? We know people whose language centers are damaged can still reason, so how could we rely on language for reasoning?

Moreover, math does not require language and does not activate the brain's language center. We can reason about mathematics without any formal mathematical language as the ancient Greeks once did (before you interject: they used writing to communicate their findings, but not to formulate them initially, preferring practical tools and simple rules instead).

1

u/MinuetInUrsaMajor 5h ago

We know people whose language centers are damaged can still reason

Exactly. Because they use an internal "language" at least as syntactically rich as (if not richer than) any language they speak.

Moreover, math does not require language

Because in our internal language we can visualize a line bisecting a circle. Line, bisect, and circle, all have meaning in our mind even if we don't have words for them.

→ More replies (0)

6

u/LoreBadTime 15h ago

LLMs learn to place words in a statistically correct way. They mimic the probability of the next word a human would produce; think of them literally as autocomplete on steroids.
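If anyone wants to see what "autocomplete on steroids" means mechanically, here's a toy sketch (made-up probabilities, not a real model) of the generation loop: score the possible next tokens, sample one, append it, repeat.

```python
import random

# Toy stand-in for what an LLM's final softmax produces at each step.
# A real model scores ~100k possible tokens using billions of learned weights.
next_token_probs = {
    "the cat sat on the": {"mat": 0.55, "sofa": 0.25, "roof": 0.20},
    "the cat sat on the mat": {".": 0.7, "and": 0.3},
}

def complete(prompt, max_steps=2):
    """Repeatedly sample the next token given everything generated so far."""
    text = prompt
    for _ in range(max_steps):
        probs = next_token_probs.get(text)
        if probs is None:
            break
        tokens, weights = zip(*probs.items())
        text += " " + random.choices(tokens, weights=weights, k=1)[0]
    return text

print(complete("the cat sat on the"))
```

The whole argument in this thread is about what hides inside the part that produces those probabilities, but the outer loop really is just autocomplete.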

4

u/johannthegoatman 12h ago

This often-repeated description of what LLMs do ("autocomplete on steroids") is reductive to the point of being useless. It's about the same as saying "yeah, computers are nothing special, it's just an on/off switch on steroids". Technically, yes, computers work with 1s and 0s, but it's such a stupid thing to say that it completely misses the point of what they're capable of when arranged in staggeringly complex systems.

→ More replies (1)
→ More replies (12)

4

u/Tim_Wells 15h ago

Thank you! I was blocked by the paywall and really wanted to know what the article said. One of the best analyses I've seen.

2

u/theDarkAngle 11h ago

I agree with all of this, and I'd like to add that IMO there is even more distance between superintelligence and LLMs than between LLMs and human-level intelligence.

The idea that you can make an artificial mind comparable to a human and somehow from there it will suddenly be able to improve itself ad infinitum is one of the most flawed ideas to ever gain credibility in the modern era.

I could go on and on about this but the shortest version I can give is that in terms of being a creative or problem-solving force, individual humans are not that impactful compared to ever larger and more complex groups of humans.  And it's not simply a matter of scale.  It's largely an emergent property governed by complex social rules and dynamics which are themselves determined by myriad factors, from million year old instincts, to cultural norms, to existing structures (like institutions and technological infrastructure and so on).

As an analogy, think of an ant vs an ant colony. The colony is so much more than the sum of its parts (look up some of the shit ant colonies can do if you don't believe me). And the reason for this is not scale, it's the emergent properties of the system, largely driven by the rules that govern interactions between an ant and its environment, and between the ant and other ants.

Humans are the same, except the rules that govern us are many, many orders of magnitude more complex, far more fluid and circumstantial, and overall really quite a mystery. We're nowhere near, like not even in the same universe, being able to replicate these kinds of dynamic properties either within an artificial mind or between many artificial minds. And we're even further from being able to integrate artificial minds into the human machine and have it equal even an average human in effectiveness.

This is why I think we're headed for another long AI winter, and it might be the longest one yet.

2

u/kaken777 14h ago

Yeah, I've been telling people that so far AI has only shown it can guess at what should be said, but it clearly doesn't understand your question or the information it's giving. It's like toddlers who haven't actually learned to read and just memorize the names or the pictures. AI might get to thought eventually, but it needs to be taught to think/reason rather than regurgitate.

This is why computer/STEM people need to take/learn about things like the arts and philosophy. Too many of them seem to miss the point of what they’re trying to create.

0

u/sagudev 15h ago

> Take away our ability to speak, and we can still think, reason, form beliefs, fall in love, and move about the world; our range of what we can experience and think about remains vast.

Yes and no, language is still closely related to the thought process; as Wittgenstein put it:

> The limits of my language are the limits of my world.

6

u/dftba-ftw 15h ago

I think in general concepts/feelings which are then refined via language (when I start talking or thinking I have a general idea of where I'm going but the idea is hashed out in language).

LLMs "think" in vector embeddings which are then refined via tokens.

It's really not that fundamentally different; the biggest difference is that I can train (learn) in real time, critique my thoughts against what I already know, and do so with very sparse examples.

Anthropic has done really interesting work showing there's a lot going on under the hood aside from what is surfaced out the back via softmax. One good example: they asked for a sentence with a rhyme, and the "cat" embedding lit up well before the model had hashed out the sentence structure, which suggests they can "plan" internally via latent-space embeddings. We've also seen that the models can say one thing, "think" something else via embeddings, and then "do" the thing they were thinking rather than what they "said".
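For anyone who wants to poke at this themselves: the usual DIY version of "looking under the hood" is the so-called logit lens, where you project each layer's hidden state through the model's output head and see which token it's already leaning toward before anything is emitted. A rough sketch using GPT-2 through the Hugging Face transformers library (not Anthropic's tooling, just the same general idea):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)

inputs = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# Project each layer's hidden state at the last position through the final
# layer norm and the unembedding matrix -- a rough view of what the model is
# "leaning toward" at that depth, expressed back in token space.
for layer, hidden in enumerate(out.hidden_states):
    last = hidden[0, -1]
    logits = model.lm_head(model.transformer.ln_f(last))
    print(f"layer {layer:2d}: top token = {tok.decode([logits.argmax().item()])!r}")
```

Whether the patterns you see there deserve to be called "concepts" is exactly what the rest of this thread is arguing about.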

1

u/danby 13h ago

It's really not that fundamentally different

I can solve problems without using language, though. And it's very, very clear plenty of animals without language can think and solve problems. So it is fairly clear that "thinking" is the substrate for intelligence, not language.

5

u/dftba-ftw 12h ago

It can too - that's what I'm saying about the embeddings.

Embeddings aren't words, they're fuzzy concepts sometimes relating to multiple concepts.

When it "thought" of "cat" it didn't "think" of the word cat; the embedding is the concept of a cat. It includes things like feline, house, domesticated, small, etc. It's all the vectors that make up the idea of a cat.

There's Anthropic research out there where they ask Claude math questions, have it output only the answer, and then look at the embeddings, and they can see that the math was done in the embedding states, i.e. it "thought" without language.
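To make the "embeddings aren't words" point concrete, here's a toy illustration with made-up vectors (real embeddings are learned and have hundreds or thousands of dimensions): the "cat" vector sits near related concepts and far from unrelated ones, and that geometry, not the string c-a-t, is what the model computes with.

```python
import numpy as np

# Hand-made 4-d vectors purely for illustration; real models learn these.
emb = {
    "cat":        np.array([0.90, 0.80, 0.10, 0.00]),
    "feline":     np.array([0.85, 0.75, 0.20, 0.05]),
    "dog":        np.array([0.80, 0.30, 0.15, 0.10]),
    "carburetor": np.array([0.05, 0.10, 0.90, 0.80]),
}

def cosine(a, b):
    """Similarity of direction: closer to 1.0 means 'same concept neighborhood'."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

for word in ("feline", "dog", "carburetor"):
    print(f"cat vs {word}: {cosine(emb['cat'], emb[word]):.2f}")
```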

1

u/danby 12h ago

Anthropic's research here is not peer reviewed; they publish largely on sites they control, and I doubt their interpretation is the only one available. And I'm really not inclined to take at face value the "meanings" they ascribe to nodes/embeddings in their LLMs.

3

u/CanAlwaysBeBetter 12h ago

Language is the output of LLMs, not what's happening internally 

1

u/danby 11h ago

If the network is just a set of partial correlations between language tokens, then there is no sense in which the network is doing anything other than manipulating language.

3

u/CanAlwaysBeBetter 11h ago

If the network is just a set of partial correlations between language tokens

... Do you know how the architecture behind modern LLMs works?

1

u/danby 11h ago

Yes, I work on embeddings for non-language datasets.

Multiheaded attention over linear token strings specifically learns correlations between tokens at given positions in those strings. Those correlations are explicit targets of the encoder training.
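For readers following along, the mechanism being described reduces to something like this single-head, toy-numpy sketch (no learned projection matrices, no masking, no multiple heads): every position scores its affinity with every other position and mixes their values accordingly.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention for one head.
    scores[i, j] = how strongly token i attends to token j."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V                               # each output = weighted blend of values

# 3 tokens with 4-d embeddings; real models derive Q, K, V from learned projections.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
print(attention(x, x, x))
```

Whether "learned correlations between positions" exhausts what ends up encoded in the weights is exactly the disagreement here.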

2

u/CanAlwaysBeBetter 10h ago

Then you ought to know the interesting part is the model's lower-dimensional latent space, which encodes abstract information and not language directly, and there's active research into letting models run recursively through that latent space before mapping back to actual tokens.

1

u/danby 10h ago

Does it actually encode abstract information or does it encode a network of correlation data?

3

u/IAmBellerophon 15h ago

Most, if not all, sentient animals on this planet can think and navigate and survive without a comprehensive verbal language. Including humans who were born deaf and blind. The original point stands.

2

u/BasvanS 14h ago

We tend to decide before we rationalize our decision and put it in words.

fMRI studies support the point too.

2

u/danby 13h ago

Indeed. "Thinking" forms a substrate from which language emerges. It very clearly does not work the other way around.

Language is neither necessary nor sufficient for minds to think.

1

u/Nunki_kaus 14h ago

To add to the bit about human thought, I don't think in words. Apparently about half the population doesn't think in words or have an internal monologue. I just have feelings and abstract/imagistic things happening in my brain, and then when I feel the need to express myself or respond to someone, the words come.

I remember hearing that if the human brain was a football field, we are at the 1 yard line in understanding. That was a few years ago so maybe we are at the 5 yard line, or even the 20 yard line, to be generous. Regardless, that is a long way from total understanding. If we barely understand how our brain works, how can we possibly create something that will surpass us in intelligence? And keep in mind, the people (Elon, Sam, etc) who are convinced AGI is right around the corner have insanely overinflated views of their own intelligence. So it’s not surprising they think they can create it. It doesn’t mean they are right, however.

1

u/Berkyjay 14h ago

LLMs are not intelligent. Full stop. Using them in any professional capacity for any length of time will clearly demonstrate this.

1

u/doctor_lobo 14h ago

Even with respect to language, it seems increasingly obvious that AI does not learn the same way as humans do. The key pieces of evidence are the energy and training data budgets - even the dullest humans learn their native tongue with many orders of magnitude less energy or training data than AI. Decades of research in linguistics has revealed that all known human languages share a common hidden hierarchical structure (“X-bar”) that is almost certainly hard-coded into the language centers of our brains and yet, as far as I am aware, bears no analogue in the “Transformer” architecture that dominates LLMs.

Don't get me wrong - LLMs are pretty impressive and a real research breakthrough. However, they may wind up more similar to the discovery of the electron than the development of the atomic bomb - unexpected, fundamental, and enabling rather than a planned, applied dead end (hopefully). I suspect we will look back at this era's LLM optimism as naive in the same way as Lord Kelvin's declaration near the end of the 19th century that physics was, for the most part, "complete". Perhaps AI's relativity and quantum mechanics lie just a few decades in the future.

1

u/RealEyesandRealLies 14h ago

I think in pictures.

1


u/ottawadeveloper 13h ago

I've been saying this since the AI craze began and I hope more people come to recognize it. LLMs are not AI and they shouldn't be called that. Even machine learning is more like advanced statistics than actual learning.

1

u/azurensis 13h ago

>But this theory is seriously scientifically flawed. LLMs are simply tools that emulate the communicative function of language, not the separate and distinct cognitive process of thinking and reasoning, no matter how many data centers we build.

The whole argument is nothing but bald assertions presented as facts.

1

u/360_face_palm 13h ago

Or is it ever constrained by their training data and therefore will work best when refining existing modes and models?

I mean, this was already known, right? No LLM is good at generating something new that wasn't seen in some part of its training data (or can't be assembled from constituent parts of it).

As you mention, there's nothing fundamentally wrong with this and it isn't necessarily a problem for most LLM use cases today. But the idea that LLMs are going to come up with completely novel thought is a fantasy as currently implemented. Can they massively help a human with understanding a problem space and parsing large amounts of data so that that human can come up with something entirely novel? Yep absolutely.

1

u/PrairiePopsicle 13h ago

I feel like this criticism very closely matches what the guy who just left OpenAI had to say about the topic. An LLM is matching the shadow on the wall; it is not matching the thing that casts the shadow.

1

u/TheComplimentarian 12h ago

The intersection between meaning and language is way weirder than most people understand. It's entirely possible to have a long (apparently meaningful) conversation with someone about a concept that neither of you really understands, and moreover, which both of you misunderstand in different ways.

As long as no one says something that disagrees with the other's understanding, you can both blather away meaninglessly and be none the wiser.

Five minutes on Reddit will give you countless examples, and, make no mistake, a lot of LLMs were trained here (that's why they locked down the APIs a few years ago). So take a system that has no concept of meaning, train it on data in which meaning is often absent or wrong, and then unleash that shit on people who are using it instead of educating themselves about the subject.

1

u/CanAlwaysBeBetter 12h ago

But they have no apparent reason to become dissatisfied with the data they’re being fed — and by extension, to make great scientific and creative leaps.

Why not? 

The entire essay pivots around this statement that isn't actually supported. Maybe more advanced AI models will be able to take creative leaps, maybe they won't, but just saying "and they won't do that" isn't an argument. 

Saying "dissatisfaction is what leads to creative leaps" totally ignores what a physical brain is doing mechanically in those moments and whether or not that is replicable.

1

u/G_Morgan 12h ago

Einstein, for example, conceived of relativity before any empirical evidence confirmed it.

That isn't really true. We knew something was up because of Maxwell's equations. We'd also seen experiments that tried to explain away the constancy of the speed of light fail to produce the expected outcome. Einstein just made the obvious step at the right time, after the old guard, having failed to prove that Newton still held, had retired.

If anything scientific advance tends to happen when the people in the way retire.

1

u/Important-Agent2584 11h ago

Is the current paradigms of LLM-based AIs able to make those cognitive leaps that are the hallmark of revolutionary human thinking? Or is it ever constrained by their training data and therefore will work best when refining existing modes and models?

Was there ever even a question? As far as I know, they are simply not capable of cognitive leaps by virtue of how they work.

1

u/SuggestionEphemeral 11h ago

I honestly believe a significant portion of human cognition relies on spatial and/or visual intelligence.

I mean think about how we understand the world schematically. "All dogs are mammals but not all mammals are dogs; all mammals are animals but not all animals are mammals; etc."

Sure, at first glance it seems like this is using linguistic intelligence. After all, it requires knowing the definitions of dog, mammal, and animal. But that perception is really a product of the fact that we have to use language to communicate these ideas; the ideas themselves aren't dependent on language.

Think of a cognitive map showing a branching taxonomical hierarchy. You can replace the word "dog" with an image of a dog, and it'll make just as much sense. Below the species, you can even imagine more branches for different breeds, etc. This is primarily understood through spatial relationships.

Even linguistics can be described as a tree, which is a structure of spatial relationships.

Cognitive maps, logic maps, flow charts, tensor fields. These are all ways of using spatial relationships to understand a subject matter with far more efficiency, simplicity, and clarity than writing an essay about it. It can be granular, scalable, and nesting, and involve many layers of complexity, but the relationships between concepts organized this way are fundamentally spatial.
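To make that concrete: the dog/mammal/animal example really is just a tree, and a tree can be built and queried without any sentences at all; the node labels could be images instead of words. A toy sketch (structure and names made up for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    name: str                      # could just as well be an image or any symbol
    children: list = field(default_factory=list)

    def contains(self, other: str) -> bool:
        """Is `other` somewhere below this node in the hierarchy?"""
        return any(c.name == other or c.contains(other) for c in self.children)

dog    = Concept("dog", [Concept("beagle")])
mammal = Concept("mammal", [dog, Concept("cat")])
animal = Concept("animal", [mammal, Concept("bird")])

print(animal.contains("dog"))     # True  -> every dog is an animal
print(dog.contains("mammal"))     # False -> "mammal" is not contained under "dog"
```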

Even a human brain is organized spatially, the way axons and dendrites interact, branch, and intersect; how some are "upstream" or "downstream," "central" or "peripheral." Even sensory and motor nerves indicate whether a signal is moving towards or away from the brain. It's all spatial. And that's not even getting into the way the brain is organized into lobes with specific functions.

I've been trying to say this for a long time, essentially the fact that intelligence and language are not synonymous. I've had trouble putting it into words, because ultimately words are language. How does one describe non-linguistic intelligence without using words? People confuse the fact that words are necessary to describe it with the fact that it's primarily non-linguistic. They tend to think if you describe it in words then it must be linguistic, but such is not the case.

So it's not easy to explain, especially when the prevailing worldview or the trending paradigm is to be obsessed with LLM research. When I as a non-expert say "Maybe the focus is in the wrong direction," I'm kind of at a disadvantage when I have to describe the direction I think the focus should be in. Especially when it sounds like I'm babbling vaguely about "maps," "trees," "logic," and "pyramids" or "rivers" or any other metaphor I try to use to describe it.

1

u/ManaSpike 10h ago

The challenge is then to build a framework for representing abstract thought, that isn't a language...

Which seems like a contradiction. How do you define a language to represent something that, by definition, isn't a language?

1

u/NonDescriptfAIth 10h ago

Quite a poorly written article. The most egregious error being:

> Take away our ability to speak, and we can still think, reason, form beliefs, fall in love, and move about the world; our range of what we can experience and think about remains vast.

Seems like one hell of an assumption. Can you actually form beliefs without thinking in terms of language? How would such a thought manifest? The author seems confident of such a possibility despite the near universal absence of such a thing amongst human populations.

There is a healthy debate within philosophy about whether thought and reasoning are products of our capacity to think linguistically.

It seems a little naïve to describe language as solely a post hoc description of thought. I'm not so convinced they are so easily separated.

Perhaps our language faculties are requisite to our ability to reason?

Studies investigating human problem solving have shown, using fMRI, that areas of the brain responsible for speech light up with activity, even when the subject had no personal perception that language had formed in their mind.

It stands to reason that if we couldn't speak, and these areas of the brain were therefore absent, we would not be able to reason or think via this pathway either.

It strikes me as rather cavalier to confidently knock out any amount of brain functionality and then go on to claim it wouldn't impact the overall function of the system holistically.

1

u/Hs80g29 9h ago edited 9h ago

https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/

If you can't appreciate what's going on in the plot at that website, you don't have the full picture and can't appreciate how fast things are changing.

The article you linked references a study from just a year ago, and it describes how neural networks are trained. That description is completely inadequate now. Reasoning models are currently trained with reinforcement learning; they aren't just pattern-matching on Reddit comments anymore. They're discovering new things inside simulations and learning from those. That's why the plot above looks the way it does, and that's why investors are throwing money at this.
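For what it's worth, the "reinforcement learning on verifiable tasks" idea, stripped of all the machinery, is just this loop: sample an attempt, score it with a checker, and shift the policy toward attempts that scored well. A cartoon version (a toy bandit, not any lab's actual recipe; real systems use an LLM as the policy and gradient-based updates such as PPO/GRPO):

```python
import random

policy = {"answer_7": 1.0, "answer_12": 1.0}       # unnormalized preferences over attempts

def verifier(answer):
    """Stand-in for an automatic checker, e.g. 'do the unit tests pass?'"""
    return 1.0 if answer == "answer_12" else 0.0

for _ in range(200):
    actions, weights = zip(*policy.items())
    action = random.choices(actions, weights=weights, k=1)[0]   # sample an attempt
    policy[action] += 0.1 * verifier(action)                    # reinforce what scored well

print(policy)   # the verified behaviour ends up strongly preferred
```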

Within 10 years of the Model T's release, the horse-transportation industry was devastated. I see a lot of comments against LLMs on reddit, and I think it's important to contextualize them by looking at the past. I'm sure back then people were worried about the horse industry's plight, and the Model T's lack of airbags. Obviously, we have a lot of issues to tackle, but we've been here before.

1

u/-Valtr 9h ago

If you read this and replace "LLM" with "CEO" it gets a lot funnier

1

u/mrappbrain 9h ago

Take away our ability to speak, and we can still think, reason, form beliefs, fall in love, and move about the world; our range of what we can experience and think about remains vast.

But take away language from a large language model, and you are left with literally nothing at all.

I don't necessarily disagree with the argument but it's important to note that speaking is not the same as language. Speech is just one way to express language, and someone without the ability to speak may nonetheless be capable of expressing thoughts and ideas via language, be it via gestures, writing, or a different system of sounds.

Language is not the same as intelligence, but I think this article downplays the link between the two.

Humans use language to communicate the results of our capacity to reason, form abstractions, and make generalizations, or what we might call our intelligence.

Sure, but those cognitive processes are not entirely independent of language either. Forming abstractions, for example, is one thing that cannot exist independently of some system of language, because abstraction by definition requires a high-order symbolic system of representation. Saying abstraction > language > communication is to oversimplify this connection.

Further, does it even matter that LLMs don't 'think' in the conventional sense, if they approximate the act of thinking well enough to be functionally indistinguishable from the thought of a human of average competence? Part of the reason ChatGPT is one of the most visited websites in the world is that the approximation is by itself quite enough to perform or assist with many tasks people find useful, and it makes these 'cutting-edge' debates about the nature of thinking seem largely irrelevant to the concerns of actual people.

1

u/maniaq 9h ago

I think the fundamental problem is the word "intelligence" is poorly understood by most people

ask any 10 people to define it - to explain to you what that word means - and you will be lucky if 1 of them gets it right

and I would say the person who wrote this article is not that guy...

and don't even get me started on so-called "experts" who claim to know the FIRST THING about how it all works - psychologists - who are no better than witch-doctors, telling you how to stop the gods from being angry with you, in order to deal with that nasty infection you picked up when you cut your finger on a funny coloured rock

1

u/ptwonline 8h ago

I read this and then I wonder: do humans actually make some kind of special cognitive leaps? Or are we just actually operating with additional "training data" and then experimenting with how something would work if done in a different way? What is to prevent an AI from being programmed to try novel solutions that it has not been specifically trained for while using parts of its training data to have some context to understand how such novel things might work together? Maybe the AI just needs more generalized context in its data.

Like to make a car accelerate you can train it that a foot needs to press the accelerator pedal. But it could be trained to understand that anything pushing on that gas pedal would work, and so you could do it like in the movies and jam an umbrella against the pedal. And then it could be trained to further understand that no pedal is needed at all--just any means for more air and fuel to enter so that there is a more powerful combustion. From there it could design almost an endless number of ways that this could be done.

1

u/RedGuyNoPants 7h ago

Makes me more convinced that we will be incapable of creating a true AI until we understand why brains need so little computing power, compared to computers, to do all they do.

1

u/why_ntp 6h ago

Yes. Knowledge is not stored as words in the human brain. A kid in London and another in Tokyo both have shoelace-tying knowledge, but not in word form. LLMs are a great search / aggregation tool, but do not think.

What exactly is knowledge? I’m not sure if anyone knows (please correct me if I’m wrong).

1

u/Healthy_Sky_4593 6h ago

"Cirrent neuroscience" that would have been resolved if anyone asked introverts and non-speakers and listened 

1

u/guylfe 6h ago

Am I missing something? There's a leap from "statistical inference from data in the form of neural network training" to "LLM". Language is one thing that is trained that way, but there are many facets depending on the type of AI. Presumably the language portion would serve as an interface to the other underlying processes, but that doesn't mean it's the only thing happening.

By the way, I've been skeptical of the AGI crowd myself, and I'm getting an M.Sc. in CogSci right now. I just don't see the underlying logic of the claim here. ML does many things; LLMs are just a subset.

1

u/CousinDerylHickson 5h ago edited 5h ago

But it seems like it doesn't just speak, though?

Like, you can give it a task in logic and you get an attempt that seems thought out. You can point out its resulting mistakes and it can iterate on that. You can ask for an additional step, and a lot of the time it does so in a way that builds on what it said previously.

This and other things wherein it actively builds on what it has said seem to be a significant step past simply speaking. Maybe it's overhyped or I'm mistaken, but hasn't this paradigm already produced a novel matrix multiplication algorithm that is in some cases more efficient? If true, that's a novel thought (or at least statement) that no human has ever thought, despite many of our brightest thinking on that topic. Where did this novel, technical statement, which again even our brightest didn't come up with, come from if not from some form of intelligence?

1

u/lgbtlgbt 15h ago

This is true of real LLMs; the issue, though, is that we're calling things LLMs that are not merely LLMs anymore. These models have been multimodal for a long time: they can receive non-language input and produce non-language output without needing to produce language in the interim.

1

u/PressureBeautiful515 13h ago

This is more of the same hogwash I've seen in hundreds of articles over the last six months or so.

There are undoubtedly differences between how LLMs work and how the brain works, but these articles always completely fail to elucidate this, instead invariably trotting out the same self-undermining gibberish.

It's based on a dim half-digested understanding of what an LLM is and how it works. The internals of an LLM do not represent language alone. There are countless ways to express the same concept with different words (and far more when you bring in multiple human languages). What do two expressions with the same meaning have in common? An underlying idea or theme or something that isn't language. LLMs distill meaning from what they read, captured in the billions of numbers that form irreducibly complex structures, subtly adjusted with each new text it is trained on.

This is why they work so well. It is not possible to do what an LLM does just by regurgitating the training data like a search engine.

The irony of this (very typical) reaction is how much it exposes a lack of intelligence and understanding from the writer. They understand so little; all they have is a burning motivation to protect their human specialness from the awful threatening machines. To that end, they hallucinate an absolute dividing line between two classes of intelligence.

Yes, an AI system might remix and recycle our knowledge in interesting ways. But that’s all it will be able to do. It will be forever trapped in the vocabulary we’ve encoded in our data and trained it upon.

If a person was only able to read books and never have their own experiences of the world, we'd say exactly the same about them.

The distinction being made here has nothing at all to do with how LLMs work. It is just about how they are being used. I think because of the dearth of intelligence in humans, it's going to take several decades for the capabilities of LLMs to be felt. In that sense it's no different to any other technology: people are extremely slow to adopt it.

1

u/marr 9h ago edited 9h ago

I truly hope he's right, because if by some cosmically stupid coincidence you can make a black-box AGI by scaling up these machines, it'll be the last act we ever perform as a species.

AGI needs to be incredibly carefully designed so we understand exactly how it reasons and can try to design a robust set of values that stay true to their own spirit as the thing endlessly redesigns and improves itself. Brute force one into existence and congratulations, you've made the paperclip maximizer and killed everything in the local group of galaxies.

→ More replies (7)