Daily Arxiv

This page collects artificial intelligence papers published around the world.
The summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions. When sharing, simply cite the source.

Investigating the interaction of linguistic and mathematical reasoning in language models using multilingual number puzzles

Created by
  • Haebom

Authors

Antara Raaghavi Bhattacharya, Isabel Papadimitriou, Kathryn Davidson, David Alvarez-Melis

Cross-linguistic Numeral Systems in Large Language Models: Challenges and Future Directions

Outline

Number systems differ widely across languages in how they compose and combine numbers. This paper explores why large language models (LLMs) struggle to solve linguistic-mathematical puzzles built on these systems, even though humans solve them successfully. The authors run experiments that disentangle the linguistic and mathematical aspects of number composition and combination. The results show that LLMs cannot consistently solve these problems unless the mathematical operations are explicitly marked with symbols (e.g., "20 + 3"). The authors also analyze how individual parameters of number composition and combination affect performance. They conclude that humans understand and reason about the inherent structure of number systems, whereas LLMs appear to lack a concept of this structure.
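The contrast between implicit and explicitly marked operations can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the words ("pa", "ti", "ku", "mo", "sel"), the base-5 system, and the function names are invented and do not come from the paper or from any real language. The sketch renders the same number once with the arithmetic left implicit in word order and once with the operations written out as symbols.

```python
# Hypothetical toy numeral system (invented for illustration, not from the paper):
# four digit words and one base word with value 5.
DIGITS = {1: "pa", 2: "ti", 3: "ku", 4: "mo"}
BASE_WORD, BASE = "sel", 5


def numeral_parts(n):
    """Split n into (multiplier of the base, remainder) for 1 <= n < 25."""
    if not 1 <= n < BASE * BASE:
        raise ValueError("this toy system only covers 1..24")
    return divmod(n, BASE)


def implicit_form(n):
    """Render n with the operations left implicit in word order."""
    q, r = numeral_parts(n)
    words = []
    if q > 1:
        words.append(DIGITS[q])   # multiplier comes before the base word
    if q >= 1:
        words.append(BASE_WORD)
    if r:
        words.append(DIGITS[r])   # remainder comes after the base word
    return " ".join(words)


def explicit_form(n):
    """Render n with multiplication and addition marked by symbols."""
    q, r = numeral_parts(n)
    terms = []
    if q > 1:
        terms.append(f"{DIGITS[q]} * {BASE_WORD}")
    elif q == 1:
        terms.append(BASE_WORD)
    if r:
        terms.append(DIGITS[r])
    return " + ".join(terms)


if __name__ == "__main__":
    for n in (3, 8, 13, 24):
        print(n, "|", implicit_form(n), "|", explicit_form(n))
```

In the implicit form, a solver has to infer from examples that a word before "sel" multiplies and a word after it adds. The explicit form hands over that structure directly, which corresponds to the condition the summary describes as the only one where LLMs perform consistently.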

Takeaways, Limitations

Takeaways:
LLMs perform consistently only when the mathematical operations of the number system are explicitly marked with symbols.
LLMs struggle to grasp the inherent structure of how numbers are composed and combined.
LLMs lack the linguistic understanding of numbers that humans rely on to reason about these puzzles.
Limitations:
Current reasoning models struggle to flexibly infer compositional rules from implicit patterns in human-scale data.
Further research is needed to improve the linguistic-mathematical reasoning abilities of LLMs.
New methodologies are needed to model the inherent structure of number systems.