Daily Arxiv

This page curates papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, please cite the source.

SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning

Created by
  • Haebom

Author

Zheng Li, Qingxiu Dong, Jingyuan Ma, Di Zhang, Kai Jia, Zhifang Sui

Outline

This paper proposes SelfBudgeter, a user-friendly, adaptive, and controllable reasoning framework, to address the tendency of reasoning models that excel at complex problems to overthink simple ones. SelfBudgeter inserts a budget estimation step before reasoning and employs a dual training approach: the model first learns to predict token budgets in a standardized format, and a subsequent reinforcement learning phase trains it to autonomously plan and strictly adhere to budgets based on problem difficulty. Because SelfBudgeter outputs its budget estimate early in generation, users can predict waiting times and can manually control reasoning length through a pre-filled budget field. Experimental results show that SelfBudgeter dynamically allocates budgets according to problem complexity, achieving average response length compression of 61% for the 1.5B model and 48% for the 7B model on the GSM8K, MATH500, and AIME2025 datasets, while maintaining near-perfect accuracy.
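The budget-field mechanism described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `<budget>…</budget>` tag format, the `parse_budget` and `prefill_budget` helpers, and the example output are all assumptions made here to show how an early budget estimate could be read out for wait-time prediction, or pre-filled to cap reasoning length.

```python
import re
from typing import Optional

# Assumed tag format for the standardized budget field (hypothetical).
BUDGET_RE = re.compile(r"<budget>(\d+)</budget>")

def parse_budget(output: str) -> Optional[int]:
    """Extract the token budget the model emits before its reasoning,
    so the user can estimate waiting time early in generation."""
    m = BUDGET_RE.search(output)
    return int(m.group(1)) if m else None

def prefill_budget(prompt: str, budget: int) -> str:
    """Pre-fill the budget field in the prompt, forcing the model to
    reason within a user-chosen token budget."""
    return f"{prompt}\n<budget>{budget}</budget>\n"

# A simple problem should get a small self-predicted budget.
response = "<budget>120</budget> Reasoning... The answer is 42."
print(parse_budget(response))                      # → 120
print(prefill_budget("Solve 2+2.", 64))
```

In this sketch, the user either reads the model's own estimate (`parse_budget`) or overrides it (`prefill_budget`); the paper's RL phase is what makes the model actually respect such a field.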

Takeaways, Limitations

Takeaways:
Improved user experience: The early budget estimate lets users predict waiting time and decide whether to interrupt or continue generation.
Improved resource efficiency: Response length is compressed via dynamic budget allocation based on problem difficulty.
Controllability: Users can manually control reasoning length via the pre-filled budget field.
Maintained model performance: High accuracy is retained despite response length compression.
Limitations:
Further research is needed on generalization performance across different problem types and model sizes.
Further evaluation of the actual usability and effectiveness of user-controlled features is needed.