Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

Interactive Learning for LLM Reasoning

Created by
  • Haebom

Author

Hehai Lin, Shilei Cao, Minzhi Li, Sudong Wang, Haotian Wu, Linyi Yang, Juepeng Zheng, Chengwei Qin

Outline

ILR is a novel collaborative learning framework for multi-agent systems (MAS) that studies whether interaction between LLMs can enhance each model's independent problem-solving ability. It integrates two core components. Dynamic Interaction dynamically selects a cooperative or competitive strategy based on question difficulty and model capability, and exchanges information through Idea3 (idea sharing, idea analysis, and idea fusion). Perception Calibration trains LLMs with GRPO, integrating the reward-distribution characteristics of one LLM into the reward function of another to strengthen the cohesion of multi-agent interaction. ILR was evaluated with three LLMs of different scales on mathematical and coding benchmarks, consistently outperforming single-agent learning.
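The two components described above can be illustrated with a minimal sketch. The strategy rule, the blending weight `alpha`, and all function names here are hypothetical simplifications, not the paper's actual formulation:

```python
import statistics

def choose_strategy(question_difficulty: float, model_capability: float) -> str:
    """Pick an interaction mode per question (hypothetical rule:
    cooperate when the question outstrips the model, compete otherwise)."""
    return "cooperative" if question_difficulty > model_capability else "competitive"

def calibrated_reward(own_reward: float, peer_rewards: list[float],
                      alpha: float = 0.5) -> float:
    """Blend a peer LLM's reward distribution into this model's reward --
    a simplified stand-in for ILR's Perception Calibration term in GRPO."""
    peer_mean = statistics.mean(peer_rewards)
    return (1 - alpha) * own_reward + alpha * peer_mean

# A hard question pushes the agent toward cooperation, and its reward
# is shifted toward the peer model's average reward.
mode = choose_strategy(question_difficulty=0.8, model_capability=0.6)
reward = calibrated_reward(own_reward=1.0, peer_rewards=[0.2, 0.6, 0.4])
```

The sketch only conveys the idea of coupling one model's reward signal to another's reward statistics; the paper's actual reward function and strategy-selection criteria are more involved.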

Takeaways, Limitations

  • ILR demonstrates that multi-agent interaction can enhance the independent problem-solving ability of LLMs.
  • Dynamic Interaction increases learning efficiency by dynamically selecting cooperative or competitive strategies.
  • Idea3 presents a new interaction paradigm for effective information exchange between LLMs.
  • Perception Calibration enhances the cohesion of multi-agent interactions.
  • Experimental results show that ILR improves performance by up to 5% over single-agent learning.
  • Idea3 enhances the multi-agent reasoning capabilities of more powerful LLMs.
  • Dynamic interaction strategies outperformed purely cooperative or purely competitive strategies.
  • Specific limitations are not discussed in the paper.