Daily Arxiv

This page collects and organizes papers on artificial intelligence published worldwide.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, please cite the source.

RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing

Created by
  • Haebom

Author

Ruihan Jin, Pengpeng Shao, Zhengqi Wen, Jinyang Wu, Mingkuan Feng, Shuai Zhang, Jianhua Tao

Outline

This paper presents a routing technique for efficiently selecting the optimal LLM for a given task from a pool of large language models (LLMs). To address the weak modeling of the relationship between user queries and LLM characteristics in existing methods, the authors propose RadialRouter, a framework built on RadialFormer, a lightweight Transformer-based backbone with a radial (star-shaped) architecture that makes the query-LLM relationship explicit; the optimal LLM is selected from RadialFormer's final states. Robustness is further improved by an objective function that combines a Kullback-Leibler divergence term with a query-query contrastive loss. On RouterBench, the method outperforms existing routing approaches by 9.2% in the Balanced scenario and 5.8% in the Cost First scenario, and its adaptability to different performance-cost tradeoffs and to dynamic LLM pools supports its practical applicability.
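The paper's code is not reproduced here; the sketch below is a minimal, hypothetical illustration of the routing idea as summarized above: a central query node is connected to the candidate-LLM descriptor nodes in a star topology, messages pass only along the radial edges, and each LLM is scored from its final node state. The class and function names, dimensions, layer counts, and the lam/tau hyperparameters are all assumptions for illustration, and the query-query contrastive term is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RadialRouterSketch(nn.Module):
    """Minimal sketch of a radial query-LLM router (not the authors' code).

    A central query node is linked to K LLM descriptor nodes in a star
    topology; cross-attention passes messages along the radial edges only,
    and each LLM is scored from its final node state.
    """

    def __init__(self, dim: int = 256, n_llms: int = 8, n_layers: int = 2):
        super().__init__()
        # One learnable descriptor embedding per candidate LLM (assumption).
        self.llm_nodes = nn.Parameter(torch.randn(n_llms, dim) * 0.02)
        self.layers = nn.ModuleList(
            [nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
             for _ in range(n_layers)]
        )
        self.score_head = nn.Linear(dim, 1)

    def forward(self, query_emb: torch.Tensor) -> torch.Tensor:
        """query_emb: (B, dim) query embedding from any text encoder."""
        B = query_emb.size(0)
        q = query_emb.unsqueeze(1)                             # (B, 1, dim)
        nodes = self.llm_nodes.unsqueeze(0).expand(B, -1, -1)  # (B, K, dim)
        for attn in self.layers:
            # Radial update: LLM nodes attend only to the central query node.
            upd, _ = attn(nodes, q, q)
            nodes = nodes + upd
        return self.score_head(nodes).squeeze(-1)              # (B, K) logits


def routing_loss(logits, perf, cost, lam=0.5, tau=1.0):
    """KL part of the objective: pull routing probabilities toward a soft
    target built from a performance-cost tradeoff. lam and tau are assumed
    hyperparameters; the paper's query-query contrastive term is omitted."""
    target = F.softmax((perf - lam * cost) / tau, dim=-1)
    return F.kl_div(F.log_softmax(logits, dim=-1), target,
                    reduction="batchmean")


# Toy usage: route a batch of 4 queries over 8 candidate LLMs.
router = RadialRouterSketch()
logits = router(torch.randn(4, 256))
loss = routing_loss(logits, perf=torch.rand(4, 8), cost=torch.rand(4, 8))
best_llm = logits.argmax(dim=-1)  # index of the selected LLM per query
```

In this reading, varying lam in the soft target is what moves the router between the Balanced and Cost First regimes reported on RouterBench: a larger lam penalizes expensive models more heavily.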

Takeaways, Limitations

Takeaways:
  • Introduces RadialRouter, a new LLM routing framework built on RadialFormer.
  • Demonstrates improved performance and cost-effectiveness over existing methods on RouterBench.
  • Adapts to varying performance-cost tradeoffs and to dynamic LLM pools (see the sketch at the end of this section).
  • Presents a method for effectively learning the correlation between query and LLM features.
Limitations:
  • Dependence on the RouterBench dataset; generalization needs to be verified on other benchmarks.
  • Further analysis of RadialFormer's structural complexity and computational cost is needed.
  • Applicability and scalability in real commercial environments require further study.
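The dynamic-LLM-pool claim in the takeaways is easy to see in the sketch above: each candidate LLM is only a descriptor node, so adding or removing a model changes the node set rather than the backbone. Below is a hypothetical continuation of the earlier snippet (in practice the new descriptor would presumably be initialized from model metadata rather than at random):

```python
import torch

# Hypothetical: grow the candidate pool of RadialRouterSketch (previous
# snippet) by one LLM. Only the per-LLM descriptor set changes; the
# attention layers and scoring head are shared across nodes, so the
# backbone needs no architectural change or resizing.
new_node = torch.randn(1, router.llm_nodes.size(1)) * 0.02
router.llm_nodes = torch.nn.Parameter(
    torch.cat([router.llm_nodes.data, new_node], dim=0)
)

logits = router(torch.randn(4, 256))  # now (4, 9): one extra candidate LLM
```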