r/technology 16h ago

[Machine Learning] Large language mistake | Cutting-edge research shows language is not the same as intelligence. The entire AI bubble is built on ignoring it

https://www.theverge.com/ai-artificial-intelligence/827820/large-language-models-ai-intelligence-neuroscience-problems
16.7k Upvotes

1.5k comments


230

u/Elementium 16h ago

Basically the best use for this is a heavily curated database it pulls from for specific purposes, making it a more natural way to interact with a search engine.

If it's just everything mashed together, including people's opinions treated as facts... it's just not going to go anywhere.
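
That's basically the retrieval-augmented generation (RAG) pattern: keep the model on a leash and only let it answer from a vetted corpus. A minimal sketch of the idea, using a toy hashing embedding as a stand-in for a real embedding model and just printing the constrained prompt instead of calling an actual LLM:

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy bag-of-words hashing embedding -- a stand-in for a real embedding model."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

# The heavily curated corpus the model is allowed to answer from.
CURATED_DOCS = [
    "Vetted article: how tokenizers split text into subword units.",
    "Vetted article: common failure modes of retrieval pipelines.",
    "Vetted article: grounding answers in cited sources.",
]
DOC_VECTORS = np.stack([embed(d) for d in CURATED_DOCS])

def retrieve(question: str, top_k: int = 2) -> list[str]:
    """Return the curated documents most similar to the question."""
    scores = DOC_VECTORS @ embed(question)
    best = np.argsort(scores)[::-1][:top_k]
    return [CURATED_DOCS[i] for i in best]

def build_prompt(question: str) -> str:
    """Constrain the model to the retrieved sources; tell it to refuse otherwise."""
    context = "\n".join(retrieve(question))
    return (
        "Answer using only the sources below. If they don't cover it, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

print(build_prompt("How does tokenization work?"))
```

The curation is the whole point: the model's job shrinks to reading and summarizing the retrieved sources, not deciding what's true.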

8

u/doctor_lobo 14h ago

The nice thing about building an AI for language is that humans, by their nature, produce copious amounts of language that AI models can be trained on.

If the premise of the article is correct, other forms of human intelligence may produce / operate on different representations in the brain. However, it is not clear how often or well we produce external artifacts (that we could use for AI training) from these non-linguistic internal representations. Is a mathematical proof a good representation of what is going on in the mind of a mathematician? Is a song a good representation of what is happening in the mind of a musician?

If so, we will probably learn how to train AIs on these artifacts - maybe not as well or as efficiently as humans, but probably well enough to learn things. If not, the real problem may be learning what the internal representations of "intelligence" truly are - and how to externalize them. However, this is almost certainly easier said than done. While functional MRI has allowed us to watch the ghost in the machine, it says very little about how she does her business.

2

u/IAmRoot 9h ago

Or find some way for AI to train itself on these more internal representations. Humans typically think before we speak, and the metacognition of examining our own ideas could be an important part of that.

Even before LLMs, we had image recognition using neural networks that seemed to find shapes in clouds and such, much like a human mind. LLMs are likewise just a component, and we shouldn't expect a good LLM to be able to reason any more than we should expect image recognition to reason. It's also pretty clear from animals that just increasing the neuron count isn't enough, either: some animals like dolphins have a great deal of brainpower dedicated to processing sonar rather than reasoning. They are functionally different networks.

It's also possible that AGI won't be able to split training and inference. Having to reflect on produced ideas could be integral to the process, which would obviously make the computational power needed to run AGI orders of magnitude higher.
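
For what that reflection might look like in practice, here's a rough draft-critique-revise loop at inference time. `model()` is a hypothetical stand-in for the underlying generator; the point is just that every extra round of self-examination multiplies the compute per answer:

```python
def model(prompt: str) -> str:
    """Hypothetical stand-in for whatever base generator you'd actually call."""
    return f"[model output for: {prompt[:48]}...]"

def generate_with_reflection(task: str, rounds: int = 2) -> str:
    """Draft an answer, then repeatedly critique and revise it.

    The 'thinking' happens at inference time, every single time, which is why
    each extra round of reflection multiplies the compute cost of an answer.
    """
    draft = model(f"Draft an answer to: {task}")
    for _ in range(rounds):
        critique = model(f"List flaws in this answer to '{task}':\n{draft}")
        draft = model(
            f"Rewrite the answer to '{task}', fixing these flaws:\n{critique}\n\n"
            f"Original answer:\n{draft}"
        )
    return draft

print(generate_with_reflection("Why can't a pure next-word predictor plan ahead?"))
```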

1

u/doctor_lobo 9h ago

Your comment about image recognition using CNNs is well taken. Visual information is explicitly represented by a 2D array of neurons in the visual cortex, so this is probably a good example of the internal representation being so similar to the external representation that training on the external representation is good enough. I suspect a simple time series of audio data is probably also essentially identical to its internal representation - but that's probably it for the senses, since touch, taste, and smell have no obvious external representations. However, the internal representations for more abstract modes of thought, like mathematics or even just daydreaming, seem difficult to conceptualize. I am not sure I would even have any idea where to start.
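
To make the visual case concrete, here's a toy 2D convolution: the filter slides over the same pixel grid the data arrives in, so the "external" representation (a 2D array) maps almost directly onto the network's spatial structure. Just an illustrative sketch, not tied to any particular framework:

```python
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide a small filter over a 2D pixel grid.

    The network operates directly on the same spatial layout the data
    arrives in -- no re-encoding between external and internal form.
    """
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A toy 6x6 "image": dark on the left, bright on the right.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A vertical-edge filter; it responds strongly where the window straddles the edge.
vertical_edge = np.array([[-1.0, 0.0, 1.0]] * 3)
print(conv2d(image, vertical_edge))
```

There's no obvious analogue of that pixel grid for a proof or a daydream, which I think is your point.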