This paper applies reinforcement learning (RL) to the game of chess to improve the strategic reasoning ability of large language models (LLMs). We use a knowledge-distillation method that provides dense rewards for the quality of the LLM's outputs, derived from an action-value network pre-trained on chess. Experimental results show that dense rewards outperform sparse binary rewards, but all models fall far short of expert-level performance. The results suggest that the pre-trained models' lack of chess knowledge is the primary cause, and that RL alone cannot fully overcome this limitation. The code is available on GitHub.
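To illustrate the idea of a dense reward derived from a pre-trained action-value network, the following is a minimal sketch. It assumes the network supplies an estimated win probability for each legal move; the function and variable names (`dense_reward`, `q_values`, `legal_moves`) are illustrative assumptions, not the paper's actual implementation.

```python
def dense_reward(q_values, move, legal_moves):
    """Score the LLM's proposed move using an action-value estimate.

    q_values:    dict mapping each legal move (e.g. a UCI string) to the
                 network's estimated win probability in [0, 1].
    move:        the move string produced by the LLM.
    legal_moves: set of legal move strings in the current position.
    Returns a reward in [0, 1]; illegal moves receive 0.
    """
    if move not in legal_moves:
        return 0.0
    # Dense signal: reward is proportional to how close the chosen move's
    # value is to the best available move's value, rather than a binary
    # correct/incorrect outcome.
    best = max(q_values[m] for m in legal_moves)
    return q_values[move] / best if best > 0 else 0.0

# Toy usage with hypothetical action values:
q = {"e2e4": 0.55, "d2d4": 0.54, "g1f3": 0.52, "a2a3": 0.40}
print(dense_reward(q, "e2e4", set(q)))  # 1.0 (best move)
print(dense_reward(q, "e2e9", set(q)))  # 0.0 (illegal move)
```

A sparse binary reward, by contrast, would return 1 only when the LLM's move exactly matches the network's top choice, giving the policy far less gradient signal per position.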