Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

ReasonBridge: Efficient Reasoning Transfer from Closed to Open-Source Language Models

Created by
  • Haebom

Author

Ziqi Zhong, Xunzhu Tang

Outline

This paper presents ReasonBridge, a methodology for narrowing the performance gap between closed and open-source large language models (LLMs) on tasks that require complex reasoning and precise instruction following. ReasonBridge transfers the reasoning capabilities of powerful closed models to open-source models through a hierarchical knowledge distillation framework. It builds Reason1K, a dataset of 1,000 curated reasoning traces drawn from diverse domains and filtered with a multi-criteria selection algorithm. The methodology combines a hierarchical distillation process, a sparse reasoning-focused adapter architecture that adds only 0.3% additional trainable parameters, and a test-time compute scaling mechanism that uses guided reasoning intervention. Experiments show that ReasonBridge improves the reasoning performance of open-source models by up to 23% on benchmark tasks, substantially narrowing the gap with closed models. In particular, the enhanced Qwen2.5-14B outperforms Claude-Sonnet3.5 on MATH500 and performs comparably on AIME problems. The method generalizes effectively across reasoning domains and model architectures, offering a sample-efficient approach to improving reasoning and instruction following.
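The paper reports that the sparse adapter adds only about 0.3% additional trainable parameters. As a rough plausibility check, the overhead of a LoRA-style low-rank adapter can be computed directly; the model dimensions, rank, and number of adapted matrices below are illustrative assumptions for a ~14B-parameter model, not values taken from the paper:

```python
def lora_param_fraction(d_model: int, n_layers: int, n_matrices: int,
                        rank: int, base_params: float) -> float:
    """Fraction of extra trainable parameters added by rank-`rank`
    low-rank adapters applied to `n_matrices` square (d_model x d_model)
    weight matrices per layer. Each adapter factorizes the update as
    two matrices of shape (d_model, rank) and (rank, d_model), so it
    adds 2 * d_model * rank parameters."""
    added = n_layers * n_matrices * 2 * d_model * rank
    return added / base_params

# Illustrative: hidden size 5120, 48 layers, adapters on the four
# attention projections, rank 16, ~14B base parameters (all assumed).
frac = lora_param_fraction(d_model=5120, n_layers=48, n_matrices=4,
                           rank=16, base_params=14e9)
print(f"{frac:.2%}")  # prints "0.22%"
```

With these assumed dimensions the overhead lands in the same sub-1% regime as the paper's reported 0.3%, which is what makes such adapters cheap to train and store relative to full fine-tuning.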

Takeaways, Limitations

Takeaways:
Presents a new methodology that effectively transfers the reasoning capabilities of closed LLMs to open-source LLMs.
Significantly improves the reasoning performance of open-source models using only a small dataset (Reason1K).
Demonstrates generalizability across diverse domains and model architectures.
Helps make open-source models more competitive and narrows the performance gap with closed models.
Limitations:
The composition process and quality of the Reason1K dataset are not described in detail.
Further research is needed on the scalability of the methodology and its generalizability to other types of reasoning tasks.
Because of its dependency on closed models, it is not a fully open-source solution.
The test-time compute scaling mechanism is not described in detail.