Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning

Created by
  • Haebom

Authors

Jianfeng Pan, Senyou Deng, Shaomang Huang

Outline

This paper proposes the Chain-of-Associated-Thoughts (CoAT) framework, which replaces the "fast thinking" approach of traditional LLMs with a "slow thinking" approach that more closely resembles human thought processes. CoAT significantly expands the exploration space of LLMs by combining the Monte Carlo Tree Search (MCTS) algorithm with a novel key-information integration mechanism called "associative memory." Leveraging the structured exploration capabilities of MCTS and the adaptive learning capabilities of associative memory, CoAT explores multiple inference paths and dynamically updates its knowledge base in real time. This allows it to revisit and refine previous inferences and to adaptively integrate evolving information into accurate and comprehensive final answers. Experiments report performance gains of over 10% on open-source multi-hop question answering datasets such as HotpotQA and MuSiQue, and over 15% on the authors' own CRB dataset.
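
The paper's exact implementation is not reproduced in this summary, so the following is a minimal conceptual sketch of how MCTS-style search over reasoning steps could be combined with an associative memory that accumulates useful intermediate thoughts. The functions `llm_generate` and `evaluate` are hypothetical stand-ins for the underlying model and value estimator, not CoAT's actual components.

```python
import math
import random

# Hypothetical stand-in for an LLM call; CoAT would use a real model here.
def llm_generate(prompt: str) -> str:
    return f"thought({len(prompt) % 7})"

# Hypothetical scorer; CoAT's actual value estimation is not detailed here.
def evaluate(thought: str, memory: list[str]) -> float:
    return random.random()

class Node:
    def __init__(self, thought: str, parent=None):
        self.thought = thought
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

    def ucb(self, c: float = 1.4) -> float:
        # Standard UCB1 score used by MCTS to balance exploration and exploitation.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits
        )

def coat_search(question: str, iterations: int = 20, branching: int = 3) -> str:
    root = Node(thought=question)
    memory: list[str] = []  # associative memory: key information gathered so far

    for _ in range(iterations):
        # 1. Selection: descend the tree via UCB until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.ucb)

        # 2. Expansion: ask the LLM for candidate next thoughts, conditioning
        #    on both the current reasoning path and the associative memory.
        context = " | ".join(memory + [node.thought])
        for _ in range(branching):
            node.children.append(Node(llm_generate(context), parent=node))

        # 3. Evaluation: score one of the newly expanded thoughts.
        child = random.choice(node.children)
        reward = evaluate(child.thought, memory)

        # Associative-memory update: retain thoughts judged useful so later
        # expansions can revisit and refine earlier reasoning.
        if reward > 0.5:
            memory.append(child.thought)

        # 4. Backpropagation: propagate the reward back to the root.
        while child is not None:
            child.visits += 1
            child.value += reward
            child = child.parent

    best = max(root.children, key=lambda n: n.visits)
    return best.thought

print(coat_search("Which city hosted the 1964 Summer Olympics?"))
```

The key design choice illustrated here is that the memory is shared across the whole tree rather than stored per node, so information discovered on one branch can inform expansions on another, which is the intuition behind CoAT's real-time knowledge integration.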

Takeaways, Limitations

Takeaways:
Presents a new "slow thinking" framework that overcomes the limitations of the "fast thinking" approach of existing LLMs.
Improves performance through an effective combination of MCTS and an associative memory mechanism.
Supports exploration of multiple inference paths and real-time knowledge base updates.
Demonstrates practicality through performance improvements across multiple datasets.
Limitations:
Lacks a detailed description of the proposed CRB dataset.
The specific operation and limitations of the associative memory mechanism need further explanation.
Lacks comparative analysis against other state-of-the-art LLM reasoning approaches.
Lacks analysis of scalability and computational cost.