InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities
Created by
Haebom
Author
Shuo Cai, Su Lu, Qi Zhou, Kejing Yang, Zhijie Sang, Congkai Xie, Hongxia Yang
Outline
This paper presents InfiAlign, an efficient post-training framework for improving the reasoning performance of large language models (LLMs). InfiAlign aligns LLMs by combining supervised fine-tuning (SFT) with direct preference optimization (DPO). Its core is a robust data selection pipeline that automatically curates high-quality alignment data from open-source reasoning datasets using multidimensional quality metrics. Applied to the Qwen2.5-Math-7B-Base model, it matches the performance of existing models while using only about 12% of the original data and generalizes strongly across a variety of reasoning tasks. In particular, applying DPO yields an average improvement of 3.89% on mathematical reasoning tasks. By pairing principled data selection with its two-stage post-training recipe (SFT followed by DPO), InfiAlign offers a practical, scalable, and data-efficient solution for aligning large reasoning models. Model checkpoints are available at https://huggingface.co/InfiX-ai/InfiAlign-Qwen-7B-SFT.
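The summary names a multidimensional data selection pipeline but does not spell out its metrics, so the Python sketch below is purely illustrative: it assumes three hypothetical scoring dimensions (diversity, difficulty, and response quality), averages them, and keeps the top-scoring ~12% of a candidate pool to mirror the data budget reported above. Every function name and heuristic here is invented for illustration, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str
    response: str


def diversity_score(sample: Sample, seen_tokens: set[str]) -> float:
    # Hypothetical proxy: reward prompts whose tokens are not yet covered by the selection.
    tokens = set(sample.prompt.lower().split())
    return len(tokens - seen_tokens) / max(len(tokens), 1)


def difficulty_score(sample: Sample) -> float:
    # Hypothetical proxy: longer reasoning chains suggest harder problems (capped at 1.0).
    return min(len(sample.response.split()) / 512.0, 1.0)


def quality_score(sample: Sample) -> float:
    # Hypothetical proxy: penalize responses that look truncated.
    return 1.0 if sample.response.strip().endswith((".", "}", "$")) else 0.5


def select_alignment_data(pool: list[Sample], keep_ratio: float = 0.12) -> list[Sample]:
    """Rank the pool by a combined multidimensional score and keep the top slice."""
    seen: set[str] = set()
    scored: list[tuple[float, Sample]] = []
    for s in pool:
        score = (diversity_score(s, seen) + difficulty_score(s) + quality_score(s)) / 3.0
        scored.append((score, s))
        seen |= set(s.prompt.lower().split())
    scored.sort(key=lambda pair: pair[0], reverse=True)
    k = max(1, int(len(pool) * keep_ratio))
    return [s for _, s in scored[:k]]


if __name__ == "__main__":
    pool = [
        Sample("Prove that 2 + 2 = 4.", "By the Peano axioms ... hence 2 + 2 = 4."),
        Sample("What is 1 + 1?", "2"),
    ]
    print(len(select_alignment_data(pool, keep_ratio=0.5)))  # keeps the 1 highest-scoring sample
```

The design point this sketch captures is that selection is a ranking problem over several cheap-to-compute signals rather than a single filter, which is what makes an aggressive budget like 12% of the source data workable.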