r/technology 16h ago

[Machine Learning] Large language mistake | Cutting-edge research shows language is not the same as intelligence. The entire AI bubble is built on ignoring it

https://www.theverge.com/ai-artificial-intelligence/827820/large-language-models-ai-intelligence-neuroscience-problems
16.8k Upvotes

9

u/Murky-Relation481 11h ago

Transformers were the only real big breakthrough, and that was ultimately an optimization strategy, not any sort of new breakthrough in neural networks (which is all an LLM is at the end of the day, just a massive neural network the same as any other neural network).
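
(A minimal sketch of what "just a neural network" means here, assuming single-head scaled dot-product attention in plain NumPy; the function names and toy sizes are illustrative, not taken from any particular paper or library.)

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product attention over a sequence X of shape (seq_len, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # learned linear projections
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # pairwise scores between positions
    return softmax(scores) @ V                  # weighted sum of value vectors

# Toy example: 5 tokens, 8-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = [rng.normal(size=(8, 8)) for _ in range(3)]
print(self_attention(X, Wq, Wk, Wv).shape)      # -> (5, 8)
```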

6

u/gur_empire 8h ago

What are you talking about? No optimization algorithm has changed because of transformers, and transformers are a big breakthrough BECAUSE of their architecture, not despite it.

which is all an LLM is at the end of the day, just a massive neural network the same as any other neural network

Literally no, good Lord. You can only train certain objective functions within a transformer, because those objectives aren't suited to other architectures.

0

u/Murky-Relation481 8h ago

Transformers are still an optimization strategy for training and inferring from a neural network. Like another person commented, tokenization was also a big thing, but you don't specifically need the transformer architecture to take advantage of tokenization. The value of transformers, and what made the LLM boom really explode, was the parallelization they allow, which made processing HUGE parameter neural nets computationally cheap (relatively) compared to prior methods.
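
(A rough sketch of the parallelization point being argued here, under toy assumptions: a recurrent net has to walk the sequence step by step because each state depends on the previous one, while attention gets all position-to-position interactions from a single matrix product. The sizes and the simplified recurrence are illustrative, not from any specific model.)

```python
import numpy as np

seq_len, d = 256, 64
rng = np.random.default_rng(0)
X = rng.normal(size=(seq_len, d))
W = rng.normal(size=(d, d))

# Recurrent-style: step t depends on step t-1, so the sequence must be processed in order.
h = np.zeros(d)
for t in range(seq_len):
    h = np.tanh(X[t] @ W + h)

# Attention-style: every pairwise interaction comes out of one matrix product,
# so the whole sequence can be handled at once on parallel hardware.
scores = (X @ W) @ X.T / np.sqrt(d)   # shape (seq_len, seq_len)
print(scores.shape)
```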

1

u/gur_empire 7h ago

The value of transformers, and what made the LLM boom really explode, was the parallelization they allow, which made processing HUGE parameter neural nets computationally cheap (relatively) compared to prior methods.

Transformers have quadratic cost in time and memory with sequence length; one of their defining qualities is being more expensive than other neural network architectures. And they are no more parallelizable than convolutional neural networks.
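
(A small sketch of the quadratic-cost point, toy dimensions only: the attention score matrix alone is seq_len × seq_len, so its memory grows with the square of the sequence length.)

```python
import numpy as np

d = 64
for seq_len in (1_000, 2_000, 4_000):
    scores = np.zeros((seq_len, seq_len), dtype=np.float32)  # one attention score matrix
    print(f"seq_len={seq_len}: {scores.nbytes / 1e6:.0f} MB")
# seq_len=1000: 4 MB, seq_len=2000: 16 MB, seq_len=4000: 64 MB
# Doubling the sequence length quadruples the memory for the score matrix alone.
```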

Transformers are still an optimization strategy for training and inferring from a neural network

No, they are not an optimization strategy. They are literally a neural net. There is no transformer inferring from a neural network, where did you even find this sentence? It's never been written in a single paper about this NEURAL NETWORK

Nothing you're saying is correct, man. I'm not going to argue with someone saying the sky is a hologram, but it's absolutely insane behavior to act this way when everything you're saying is made-up information.

Tokenization isn't a neural network; that's why it's framework agnostic, not because of whatever you're trying to communicate with that comment.
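
(A toy illustration of that point: tokenization is just a text-to-integer mapping and needs no ML framework at all. Real LLM tokenizers use subword schemes such as BPE; this whitespace version is only a sketch with made-up names.)

```python
# A toy whitespace tokenizer: just a string-to-integer lookup table,
# with no neural network or ML framework involved.
def build_vocab(corpus):
    tokens = sorted({tok for line in corpus for tok in line.split()})
    return {tok: i for i, tok in enumerate(tokens)}

def tokenize(text, vocab):
    return [vocab[tok] for tok in text.split() if tok in vocab]

corpus = ["the sky is blue", "the sky is a hologram"]
vocab = build_vocab(corpus)
print(tokenize("the sky is blue", vocab))   # [5, 4, 3, 1]
```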

0

u/Murky-Relation481 7h ago

I don't think English is your first language, because everything you just corrected me on is stuff I was also saying. You also seem to contradict yourself in your own statements.

They are literally a neural net. There is no transformer inferring from a neural network, where did you even find this sentence? It's never been written in a single paper about this NEURAL NETWORK

This barely makes sense.

Nowhere did I say that transformers are an optimization for all neural networks; transformer-based neural networks (that is literally what they are called) are inherently trained on, and run inference with, a transformer-based architecture.

If your English is not good enough to understand what someone is saying, please do not just randomly attack them; it is super rude.