Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning

Created by
  • Haebom

Authors

Prateek Yadav, Colin Raffel, Mohammed Muqeeth, Lucas Caccia, Haokun Liu, Tianlong Chen, Mohit Bansal, Leshem Choshen, Alessandro Sordoni

Outline

This paper surveys and analyzes the rapidly growing field of model MoErging. The wide availability of high-performing pre-trained models has produced a large number of fine-tuned expert models, each specialized to a particular domain or task. MoErging methods, which recycle these expert models to build a combined system with better performance or generalization, are attracting growing attention. The paper presents a novel taxonomy that categorizes the design choices behind the various MoErging methods and clarifies which applications each method suits. It also surveys software tools and applications that use MoErging and discusses related research areas such as model merging, multi-task learning, and mixture-of-experts models. Together, these give a comprehensive overview of the MoErging field and lay a foundation for future research.

Takeaways, Limitations

Takeaways:
Provides a comprehensive analysis and taxonomy of MoErging methods, helping researchers understand the field and choose research directions efficiently.
Compares the strengths and weaknesses of the various MoErging methods, helping readers select a suitable method for a given application.
Improves practical applicability by listing software tools and applications related to MoErging.
Points out connections with related research fields, supporting exchange between them.
Limitations:
The taxonomy presented in the paper may not fully cover every MoErging method.
Differences in experimental setup make direct comparisons between methods difficult.
Because new MoErging methods are continuously being developed, the contents of the paper may quickly become outdated.