
Daily Arxiv

This page curates AI-related papers published worldwide.
All summaries are generated with Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

A Novel Self-Evolution Framework for Large Language Models

Created by
  • Haebom

Author

Haoran Sun, Zekun Zhang, Shaoning Zeng

Outline

In this paper, we propose a dual-stage self-evolution (DPSE) framework that pushes large language models (LLMs) beyond the limits of pre-training. Unlike existing post-training strategies that focus mainly on user alignment, DPSE jointly optimizes user preference adaptation and domain-specific expertise. A censor module extracts multidimensional interaction signals and estimates satisfaction scores, which then drive structured data augmentation through topic-aware and preference-driven strategies. The augmented dataset feeds a two-stage fine-tuning pipeline: supervised domain-grounded tuning followed by frequency-aware preference optimization. Experiments on standard NLP benchmarks and long-term dialogue tasks show that DPSE outperforms supervised fine-tuning, preference optimization, and memory-augmented baselines, and ablation studies confirm the contribution of each module. In conclusion, the DPSE framework provides an autonomous path toward the continual self-evolution of LLMs.
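To make the described pipeline more concrete, below is a minimal, hypothetical sketch of how censor-style interaction filtering and frequency-aware data preparation could look before the two fine-tuning stages. The `Interaction` fields, the scoring weights, and the `augment` helper are illustrative assumptions and are not taken from the paper; the actual DPSE modules are substantially richer.

```python
from dataclasses import dataclass
from collections import Counter
from typing import List

# Hypothetical record of one user-model interaction (not the paper's schema).
@dataclass
class Interaction:
    prompt: str
    response: str
    topic: str
    explicit_feedback: float   # e.g., thumbs up/down mapped to [0, 1]
    follow_up_count: int       # crude proxy for user engagement

def censor_score(x: Interaction) -> float:
    """Toy satisfaction estimate combining two signals; the weights are
    arbitrary and stand in for the paper's multidimensional censor module."""
    engagement = min(x.follow_up_count, 5) / 5.0
    return 0.7 * x.explicit_feedback + 0.3 * engagement

def augment(interactions: List[Interaction], threshold: float = 0.6):
    """Keep high-satisfaction interactions and split them into two pools:
    a topic-grounded SFT set and a preference set whose weights reflect
    how often each topic recurs (a simple frequency-aware heuristic)."""
    kept = [x for x in interactions if censor_score(x) >= threshold]
    if not kept:
        return [], []
    topic_freq = Counter(x.topic for x in kept)
    sft_set = [{"input": x.prompt, "target": x.response} for x in kept]
    pref_set = [
        {"prompt": x.prompt, "chosen": x.response,
         "weight": topic_freq[x.topic] / len(kept)}
        for x in kept
    ]
    return sft_set, pref_set

if __name__ == "__main__":
    logs = [
        Interaction("Explain RAG", "RAG retrieves documents...", "nlp", 1.0, 3),
        Interaction("Fix my SQL", "Use a JOIN here...", "databases", 0.2, 0),
    ]
    sft_set, pref_set = augment(logs)
    print(len(sft_set), "SFT examples;", len(pref_set), "preference examples")
    # Stage 1 would run supervised fine-tuning on sft_set; stage 2 would run
    # preference optimization (e.g., a DPO-style objective) using the weights.
```

The point of the sketch is only the data flow: filter interactions by an estimated satisfaction score, then derive separate datasets for the supervised and preference-optimization stages.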

Takeaways, Limitations

Takeaways:
We present a novel post-training framework that simultaneously strengthens user preference adaptation and domain-specific expertise.
We show that the censor module and the two-stage fine-tuning pipeline open a path toward autonomous self-evolution of LLMs.
We demonstrate superior performance over existing methods across a range of benchmarks.
Limitations:
Further analysis of the computational cost and scalability of the proposed framework is needed.
Generalization to diverse LLM architectures and datasets remains to be verified.
The reliability and potential biases of the censor module require deeper investigation.
The unpredictability and safety risks that may arise during long-term self-evolution deserve careful consideration.