Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use

Created by
  • Haebom

Authors

Junjie Ye, Yilong Wu, Sixian Li, Yuming Yang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan, Zhengyin Du

Outline

This paper addresses improving the tool-use performance of large language models (LLMs) that interact with their environments through tools. Existing supervised fine-tuning (SFT) approaches rely on large datasets and overlook task-specific characteristics. To address this, the researchers analyzed three existing LLMs and found that suboptimal training data can interfere with tool-use behavior, that token importance is unevenly distributed, and that tool-invocation errors concentrate in a few categories. Based on these findings, they propose TL-Training, a task-feature-based framework that mitigates the effects of suboptimal training data, dynamically adjusts token weights during SFT to prioritize important tokens, and applies a reward mechanism tailored to error categories, optimized via proximal policy optimization (PPO). Training CodeLLaMA-2-7B and evaluating it on four open-source test sets shows that, even with a limited training set (1,217 training data points), TL-Training achieves tool-use performance comparable to or better than open- and closed-source LLMs. It also improves robustness in noisy environments and general task performance, offering a scalable and efficient paradigm for tool-use training in LLMs. Code and data are available at https://github.com/Junjie-Ye/TL-Training.
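To make the dynamic token-weighting idea concrete, here is a minimal sketch of a token-weighted SFT loss. This is not the authors' implementation; the function name `weighted_sft_loss` and the specific weighting scheme (larger weights for tokens inside a tool call) are assumptions made purely for illustration.

```python
# Illustrative sketch only (assumed scheme, not the TL-Training code):
# a per-token weighted cross-entropy loss for SFT, where tokens judged
# important for tool use (e.g., tool names and argument values) get
# weights > 1 and ordinary tokens keep weight 1.
import torch
import torch.nn.functional as F

def weighted_sft_loss(logits, labels, token_weights, ignore_index=-100):
    """
    logits:        (batch, seq_len, vocab) model outputs
    labels:        (batch, seq_len) target token ids; ignore_index marks padding
    token_weights: (batch, seq_len) per-token weights (hypothetical, e.g. 2.0
                   inside a tool call, 1.0 elsewhere)
    """
    vocab = logits.size(-1)
    # Unreduced cross-entropy so each token keeps its own loss term.
    per_token = F.cross_entropy(
        logits.view(-1, vocab),
        labels.view(-1),
        ignore_index=ignore_index,
        reduction="none",
    ).view_as(labels).float()

    mask = (labels != ignore_index).float()
    weights = token_weights * mask
    # Normalize by the total weight so the loss scale stays comparable to plain SFT.
    return (per_token * weights).sum() / weights.sum().clamp_min(1.0)
```

In practice, the weight tensor would be derived from the tokenized tool-call spans of each training example; how TL-Training actually computes token importance is described in the paper itself.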

Takeaways, Limitations

Takeaways:
  • Presents an efficient training framework (TL-Training) that achieves strong tool-use performance even with limited training data.
  • Improves robustness in noisy environments and general task performance.
  • Offers a scalable and efficient paradigm for tool-use training in LLMs.
  • Analyzes the limitations of existing SFT methods and suggests directions for improvement.
Limitations:
  • TL-Training's gains may be limited to the specific LLMs and datasets studied.
  • Generalization to a wider range of tools and task types still needs to be verified.
  • More extensive experiments and comparative studies are needed to confirm generalization and versatility.
  • The small size of the training data calls for further research on generalizability to real-world, large-scale applications.