From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning
Created by
Haebom
Author
Chen Shani, Liron Soffer, Dan Jurafsky, Yann LeCun, Ravid Shwartz-Ziv
Outline
This study applies the information bottleneck principle to compare how LLMs and humans balance compression against semantic preservation. By comparing embeddings from over 40 LLMs against classic human categorization benchmarks, the authors reveal systematic differences in how LLMs and humans represent concepts.
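To make the trade-off concrete, below is a minimal sketch of an information-bottleneck-style score for a set of item embeddings under a given categorization. This is an illustrative simplification, not the authors' implementation: it uses cluster entropy as the compression term and within-cluster variance as the distortion term, and all names and data are hypothetical.
```python
import numpy as np

def ib_tradeoff(embeddings, labels, beta=1.0):
    """Score a categorization of items by a simplified IB-style objective:
    complexity (bits to encode cluster membership) plus beta times
    distortion (mean squared distance of items to their cluster centroid).
    Lower means "more efficient" in the compression sense."""
    embeddings = np.asarray(embeddings)
    labels = np.asarray(labels)
    n = len(labels)
    clusters, counts = np.unique(labels, return_counts=True)
    # For a hard clustering with uniform p(x), I(X; C) equals the entropy H(C).
    p = counts / n
    complexity = -np.sum(p * np.log2(p))
    # Distortion: how much within-category semantic detail the clustering discards.
    distortion = 0.0
    for c in clusters:
        members = embeddings[labels == c]
        distortion += np.sum((members - members.mean(axis=0)) ** 2)
    distortion /= n
    return complexity + beta * distortion

# Hypothetical usage: random data stands in for real LLM embeddings and
# human category labels over the same items.
rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))       # (n_items, embedding_dim)
human_cats = rng.integers(0, 5, 100)   # human category label per item
print(ib_tradeoff(emb, human_cats, beta=0.5))
```
Scoring a human categorization and a model-derived clustering of the same items with the same beta makes their positions on the compression/meaning trade-off directly comparable.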
Takeaways, Limitations
•
Takeaways:
◦
LLMs form broad categories that roughly match human ones, but they miss the fine-grained semantic distinctions that matter for human understanding.
◦
LLMs perform aggressive statistical compression to achieve "optimal" information-theoretic efficiency, whereas humans prioritize contextual richness and adaptive flexibility (the classical objective behind this efficiency claim is sketched after this list).
◦
Encoder models align more closely with human categorization than decoder models (see the alignment sketch at the end of this page).
◦
Conceptual structure develops in distinct stages: an initial phase of rapid formation is followed by structural reorganization, and semantic processing migrates from deep layers to intermediate layers as the model discovers more efficient encodings.
•
Limitations:
◦
LLMs optimize for compression, while humans optimize for adaptive usability, revealing a fundamental difference between artificial and biological intelligence.
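For reference, the "optimal" information-theoretic efficiency takeaway points to the classical Information Bottleneck objective. The standard form, which may differ in detail from the paper's exact variant, compresses the input X into clusters C while preserving information about a relevant variable Y:
```latex
% Classical Information Bottleneck (standard form, not quoted from the paper):
% beta trades compression of X against preserved relevance to Y.
\min_{p(c \mid x)} \; I(X; C) \;-\; \beta \, I(C; Y)
```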
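The encoder-versus-decoder takeaway can be illustrated with a simple alignment metric: cluster each model's item embeddings and score agreement with the human category labels. A minimal sketch, assuming scikit-learn is available; the variable names and data are hypothetical, and this is not the paper's evaluation protocol.
```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_mutual_info_score

def human_alignment(embeddings, human_categories):
    """Cluster a model's item embeddings into as many groups as there are
    human categories, then score agreement with the human labels
    (1.0 = perfect agreement, ~0.0 = chance)."""
    k = len(np.unique(human_categories))
    model_clusters = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)
    return adjusted_mutual_info_score(human_categories, model_clusters)

# Hypothetical usage: encoder_emb and decoder_emb would be (n_items, dim)
# arrays extracted from an encoder and a decoder model for the same items;
# random data is used here so the snippet runs on its own.
rng = np.random.default_rng(1)
encoder_emb = rng.normal(size=(100, 16))
human_cats = rng.integers(0, 5, 100)
print(human_alignment(encoder_emb, human_cats))  # near 0 for random embeddings
```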