Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Can Large Language Models Develop Strategic Reasoning? Post-training Insights from Learning Chess

Created by
  • Haebom

Authors

Dongyoon Hwang, Hojoon Lee, Jaegul Choo, Dongmin Park, Jongho Park

Outline

This paper applies reinforcement learning (RL) to the game of chess to improve the strategic reasoning ability of large language models (LLMs). The authors use a knowledge distillation approach that provides dense rewards for the quality of the LLM's outputs, leveraging an action-value network pre-trained on chess. Experimental results show that dense rewards outperform sparse binary rewards, yet all models fall far short of expert-level performance. The results suggest that the pre-trained models' limited internal understanding of chess is the primary cause, and that RL alone cannot fully overcome this limitation. The code is available on GitHub.
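The contrast between the two reward schemes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `action_value` dictionary stands in for the paper's pre-trained chess action-value network, and all function names and values here are assumptions for demonstration.

```python
# Sketch of sparse vs. dense rewards for an LLM's proposed chess move.
# The q-values below are a hypothetical stand-in for a pre-trained
# action-value network's outputs over the legal moves in one position.

def sparse_reward(predicted_move: str, best_move: str) -> float:
    """Binary signal: credit only for exactly matching the expert move."""
    return 1.0 if predicted_move == best_move else 0.0

def dense_reward(q_values: dict, predicted_move: str) -> float:
    """Graded signal: min-max normalized action value of the chosen move,
    so near-best moves still earn partial credit."""
    if predicted_move not in q_values:
        return 0.0  # illegal or unparsable move gets no reward
    lo, hi = min(q_values.values()), max(q_values.values())
    if hi == lo:
        return 1.0
    return (q_values[predicted_move] - lo) / (hi - lo)

# Toy position: assumed q-values for four legal moves.
q = {"e2e4": 0.62, "d2d4": 0.60, "g1f3": 0.55, "a2a3": 0.10}

print(sparse_reward("d2d4", best_move="e2e4"))  # 0.0 — no credit at all
print(dense_reward(q, "d2d4"))                  # ~0.96 — near-best move still rewarded
```

The graded signal is what makes the reward "dense": a reasonable but imperfect move still moves the policy in the right direction, whereas the sparse reward gives the optimizer nothing to learn from unless the model already finds the single best move.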

Takeaways, Limitations

Takeaways: Validated the applicability of RL to enhancing the strategic reasoning ability of LLMs through the game of chess, and confirmed the effectiveness of knowledge distillation-based dense rewards over sparse binary rewards.
Limitations: None of the models reached expert-level performance. The pre-trained models' lack of internal understanding of chess limited what RL training could achieve, suggesting that RL alone is unlikely to fully develop the strategic reasoning abilities of LLMs.