
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning

Created by
  • Haebom

Author

Justin Chih-Yao Chen, Sukwon Yun, Elias Stengel-Eskin, Tianlong Chen, Mohit Bansal

Outline

This paper presents a method for combining existing pre-trained expert LLMs (Large Language Models) to handle large-scale, diverse tasks efficiently. To overcome the limitations of existing task-level expert selection methods, the authors propose Symbolic-MoE, a framework that enables instance-level, adaptive mixing of experts. Symbolic-MoE dynamically selects relevant expert LLMs using a fine-grained, skill-based approach (e.g., algebra in mathematics or molecular biology in biomedical reasoning). Each selected expert generates its own reasoning, and the results are synthesized into a final high-quality response by an aggregator chosen for its ability to integrate diverse reasoning outputs. To address the high computational overhead of repeatedly loading and unloading models, the framework groups instances by their assigned experts and processes them in batches. The approach outperforms GPT-4o-mini and multi-agent approaches on several benchmarks (MMLU-Pro, GPQA, AIME, MedMCQA), achieving an average improvement of 8.15% over the strongest multi-agent baseline. It also generalizes well to new tasks and outperforms discussion-based baselines without requiring costly multi-round discussions.
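The skill-based routing and batching described above can be sketched in a few lines. Note this is a minimal illustration, not the paper's implementation: the expert names, the skill-proficiency profiles, and the scoring rule (mean proficiency over an instance's required skills) are all hypothetical stand-ins for what the paper derives from validation-set performance.

```python
from collections import defaultdict

# Hypothetical skill-proficiency profiles for each expert model.
# (In Symbolic-MoE these scores come from measured performance,
# not hand-assigned values as here.)
EXPERT_PROFILES = {
    "math-expert":    {"algebra": 0.9, "geometry": 0.8, "molecular_biology": 0.2},
    "bio-expert":     {"algebra": 0.3, "geometry": 0.2, "molecular_biology": 0.9},
    "general-expert": {"algebra": 0.6, "geometry": 0.6, "molecular_biology": 0.6},
}

def select_experts(instance_skills, k=2):
    """Score each expert by its mean proficiency on the skills this
    instance requires, and return the top-k experts (sorted tuple so
    the set can serve as a dictionary key)."""
    scores = {
        name: sum(profile.get(s, 0.0) for s in instance_skills) / len(instance_skills)
        for name, profile in EXPERT_PROFILES.items()
    }
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return tuple(sorted(top_k))

def group_by_experts(instances, k=2):
    """Batch instances that were routed to the same expert set, so each
    expert model only needs to be loaded once per group."""
    groups = defaultdict(list)
    for inst in instances:
        groups[select_experts(inst["skills"], k)].append(inst["question"])
    return dict(groups)

instances = [
    {"question": "Solve x^2 - 4 = 0", "skills": ["algebra"]},
    {"question": "Factor x^3 - 1", "skills": ["algebra"]},
    {"question": "What does DNA polymerase do?", "skills": ["molecular_biology"]},
]
batches = group_by_experts(instances)
```

Grouping by the selected expert set is what makes the batching pay off: the two algebra questions land in one batch, so their shared experts are loaded once rather than per instance.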

Takeaways, Limitations

Takeaways:
Demonstrates that instance-level expert selection can improve LLM performance
Shows the effectiveness of a skill-based expert selection strategy
Reduces computational overhead through an efficient batching strategy
Surpasses strong existing models across a variety of benchmarks
Achieves high performance and good generalization without multi-round discussions
Limitations:
Further research is needed on the scalability of the proposed method (performance and efficiency with a larger pool of expert models).
The objectivity and reliability of the criteria used to evaluate expertise in specific skills need additional verification.
Generalization performance across a broader range of task types needs further evaluation.
Further research is needed to optimize the aggregator selection strategy.