Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Think How to Think: Mitigating Overthinking with Autonomous Difficulty Cognition in Large Reasoning Models

Created by
  • Haebom

Author

Yongjiang Liu, Haoxi Li, Xiaosong Ma, Jie Zhang, Song Guo

Outline

This paper proposes Think-How-to-Think (TH2T), a two-stage fine-tuning strategy for mitigating overthinking in large reasoning models (LRMs). The first stage injects difficulty awareness into the model so that it can adapt its reasoning depth to the task; the second stage identifies and removes redundant patterns in the intermediate reasoning steps. The model is trained on a dataset that mixes short and long reasoning paths, and experiments on 7B, 14B, and 32B models show that TH2T maintains performance while cutting inference costs by more than 70% on easy tasks and more than 40% on hard tasks.
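To make the two-stage idea more concrete, below is a minimal illustrative sketch (not the authors' code) of how a difficulty-aware fine-tuning mixture and a simple redundancy filter might be assembled. The `ReasoningExample` fields, the `estimate_difficulty` heuristic, and the difficulty-tag prompt format are assumptions made for illustration only.

```python
# Illustrative sketch only: not the TH2T implementation.
# Field names, the difficulty heuristic, and the prompt format are assumptions.

from dataclasses import dataclass
from typing import List


@dataclass
class ReasoningExample:
    question: str
    short_trace: str   # concise reasoning path
    long_trace: str    # full reasoning path
    answer: str


def estimate_difficulty(example: ReasoningExample) -> str:
    """Toy difficulty proxy: a longer reference trace suggests a harder problem."""
    return "hard" if len(example.long_trace.split()) > 200 else "easy"


def build_stage1_mixture(examples: List[ReasoningExample]) -> List[dict]:
    """Stage-1 style data: prepend a difficulty cue and pair easy questions with
    short traces and hard questions with long traces, so the model learns to
    match reasoning depth to task difficulty."""
    records = []
    for ex in examples:
        difficulty = estimate_difficulty(ex)
        trace = ex.short_trace if difficulty == "easy" else ex.long_trace
        records.append({
            "prompt": f"[difficulty: {difficulty}]\n{ex.question}",
            "target": f"{trace}\nAnswer: {ex.answer}",
        })
    return records


def prune_redundant_steps(trace: str) -> str:
    """Stage-2 style filter: drop verbatim-repeated intermediate steps as a
    crude stand-in for redundancy removal."""
    seen, kept = set(), []
    for step in trace.split("\n"):
        key = step.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(step)
    return "\n".join(kept)


if __name__ == "__main__":
    ex = ReasoningExample(
        question="What is 17 + 25?",
        short_trace="17 + 25 = 42.",
        long_trace=(
            "First add the tens: 10 + 20 = 30.\n"
            "Then the units: 7 + 5 = 12.\n"
            "Then the units: 7 + 5 = 12.\n"
            "30 + 12 = 42."
        ),
        answer="42",
    )
    print(build_stage1_mixture([ex])[0]["prompt"])
    print(prune_redundant_steps(ex.long_trace))
```

In the paper, both stages operate on the model's own reasoning behavior rather than hand-written rules; the word-count heuristic and verbatim deduplication above are only simple stand-ins for that behavior.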

Takeaways, Limitations

Takeaways:
A novel method for effectively addressing the overthinking problem in large reasoning models is presented.
Significantly reduces inference costs while maintaining model performance.
Improves the model's ability to recognize task difficulty and adjust its reasoning depth accordingly.
Makes reasoning more efficient by removing redundant repetition and unnecessary content from intermediate reasoning steps.
Limitations:
Further research is needed to establish the generality of the proposed method (experiments across a broader range of problem types and models).
The paper may lack detailed explanation of how "difficulty hypnosis" and "redundancy hypnosis" are concretely implemented.
The approach may depend on the specific training dataset, and performance could degrade when extended to other datasets.