Daily Arxiv

This page collects summaries of artificial intelligence papers published around the world.
The summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, please cite the source.

Self-Evolving LLMs via Continual Instruction Tuning

Created by
  • Haebom

Authors

Jiazheng Kang, Le Huang, Cheng Hou, Zhe Zhao, Zhenxiang Yan, Chuan Shi, Ting Bai

Outline

This paper proposes MoE-CL, a parameter-efficient adversarial mixture-of-experts (MoE) framework for continual learning (CL) in large language models (LLMs), targeting the diverse and constantly evolving tasks found in industrial settings. To counter catastrophic forgetting, a critical weakness of existing CL approaches, MoE-CL adopts a dual-expert design: task-specific experts preserve the knowledge of each individual task, while a shared expert enables transfer across tasks. A generative adversarial network (GAN)-based task-aware discriminator is integrated to prevent the shared expert from passing on task-irrelevant noise. Through adversarial learning, the shared expert acquires generalized representations while the task-specific experts retain task-specific details, balancing knowledge retention against cross-task generalization. The effectiveness and practicality of MoE-CL are validated through experiments on the public MTL5 benchmark and the Tencent3 industrial benchmark, as well as A/B testing in the Tencent Video platform's content compliance review system.
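The paper's summary above does not include code, but the dual-expert idea can be sketched in a few lines of PyTorch. The sketch below is a minimal illustration only: all layer shapes, names (`DualExpertLayer`, `GradReverse`), and the use of a gradient-reversal trick are assumptions for exposition, not the authors' actual GAN-based training setup. The discriminator tries to recover the task id from the shared expert's output; the reversed gradient pushes the shared expert toward task-invariant representations, which mirrors the retention-vs-generalization balance described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity in the forward pass, negated gradient
    in the backward pass. A common stand-in for adversarial training."""
    @staticmethod
    def forward(ctx, x, lamb: float):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lamb * grad_out, None

class DualExpertLayer(nn.Module):
    """Hypothetical dual-expert adapter: one task-specific expert per task
    plus one shared expert. The discriminator classifies the source task
    from the shared expert's output; gradient reversal trains the shared
    expert to fool it, i.e. to learn task-invariant representations."""
    def __init__(self, d_model: int, num_tasks: int, lamb: float = 0.1):
        super().__init__()
        self.task_experts = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(num_tasks))
        self.shared_expert = nn.Linear(d_model, d_model)
        self.discriminator = nn.Linear(d_model, num_tasks)
        self.lamb = lamb

    def forward(self, h: torch.Tensor, task_id: int):
        task_out = self.task_experts[task_id](h)   # task-local knowledge
        shared_out = self.shared_expert(h)         # cross-task transfer
        # Adversarial branch: the discriminator is trained normally,
        # while the reversed gradient trains the shared expert against it.
        task_logits = self.discriminator(
            GradReverse.apply(shared_out, self.lamb))
        return task_out + shared_out, task_logits

# Toy usage: add a task-classification loss to a placeholder task loss.
layer = DualExpertLayer(d_model=16, num_tasks=3)
h = torch.randn(4, 16)                    # batch of 4 hidden states
task_id = 1
out, task_logits = layer(h, task_id)
target = torch.full((4,), task_id, dtype=torch.long)
loss = out.pow(2).mean() + F.cross_entropy(task_logits, target)
loss.backward()                           # one optimizer step would follow
```

In this sketch a single combined loss suffices because the gradient-reversal layer flips the sign of the adversarial gradient; a GAN-style setup like the paper's would instead alternate discriminator and generator updates.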

Takeaways, Limitations

Takeaways:
  • MoE-CL offers an effective solution to the challenges of continual learning for LLMs in industrial settings.
  • The dual-expert design, combining task-specific experts with a shared expert, mitigates catastrophic forgetting.
  • Adversarial learning balances cross-task generalization with knowledge retention.
  • Practical deployment on the Tencent Video platform demonstrated cost savings (a 15.3% reduction).
  • The methodology is practical and suitable for large-scale industrial deployment.
Limitations:
  • The reported performance of MoE-CL may be limited to the specific benchmarks and industrial environments tested.
  • A more comprehensive comparative analysis against other CL approaches is needed.
  • The design and training procedure of the GAN-based discriminator may be described in insufficient detail.
  • Additional experiments and validation may be required to cover the varied characteristics of industrial environments.