This paper presents a routing technique that efficiently selects the most suitable large language model (LLM) for a given query from a pool of candidates. To address existing methods' failure to capture the relationship between user queries and LLM characteristics, we propose a novel framework called RadialRouter. RadialRouter uses RadialFormer, a lightweight Transformer-based backbone with a radial architecture, to model the query-LLM relationship, and selects the optimal LLM based on RadialFormer's final state. Robustness is enhanced by an objective function that combines Kullback–Leibler divergence with a query-query contrastive loss. Experiments on RouterBench show that the method outperforms existing routing methods by 9.2% in the Balanced scenario and 5.8% in the Cost First scenario, and its adaptability to different performance-cost tradeoffs and to dynamic LLM pools underscores its practical applicability.
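The summary above only sketches the architecture, so below is a minimal PyTorch illustration of the two ideas it names: a radial interaction block in which a central query node attends to one peripheral node per candidate LLM, and a training objective combining a KL-divergence term with a query-query contrastive term. All names and shapes here (RadialFormerSketch, num_llms, d_model, and the construction of target_dist and pos_mask) are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RadialFormerSketch(nn.Module):
    """Illustrative radial interaction block (assumed design): a central
    query node attends to one peripheral node per candidate LLM."""

    def __init__(self, d_model: int, num_llms: int, n_heads: int = 4):
        super().__init__()
        # One learnable embedding per candidate LLM (the peripheral nodes).
        self.llm_nodes = nn.Parameter(torch.randn(num_llms, d_model))
        # Cross-attention: the query (center) attends to the LLM nodes.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.score_head = nn.Linear(d_model, num_llms)

    def forward(self, query_emb: torch.Tensor) -> torch.Tensor:
        # query_emb: (batch, d_model) embedding of the user query.
        center = query_emb.unsqueeze(1)                        # (B, 1, D)
        periphery = self.llm_nodes.unsqueeze(0).expand(
            query_emb.size(0), -1, -1)                         # (B, N, D)
        updated, _ = self.attn(center, periphery, periphery)   # (B, 1, D)
        final_state = self.norm(updated.squeeze(1) + query_emb)
        # Routing logits over the candidate LLMs, read off the final state.
        return self.score_head(final_state)                    # (B, N)


def routing_loss(logits, target_dist, query_emb, pos_mask, temperature=0.1):
    """Combined objective as described in the summary: KL divergence to a
    target routing distribution plus a query-query contrastive term.
    target_dist and pos_mask (which query pairs count as positives) are
    assumed to come from offline performance/cost statistics."""
    log_probs = F.log_softmax(logits, dim=-1)
    kl = F.kl_div(log_probs, target_dist, reduction="batchmean")

    # Query-query contrastive term (InfoNCE-style; an assumption here).
    z = F.normalize(query_emb, dim=-1)
    sim = z @ z.T / temperature
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float("-inf"))          # ignore self-pairs
    pos_sim = sim.masked_fill(~pos_mask, float("-inf")).logsumexp(dim=-1)
    all_sim = sim.logsumexp(dim=-1)
    contrastive = (all_sim - pos_sim).mean()  # assumes each query has a positive

    return kl + contrastive
```

At inference, the router would pick the LLM with the highest (or cost-adjusted) logit for each query, which is where the performance-cost tradeoff mentioned above would come into play.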
Takeaways, Limitations
• Takeaways:
  ◦ Introduces RadialRouter, a new LLM routing framework built on RadialFormer.
  ◦ Demonstrates improved performance and cost-effectiveness over existing methods on RouterBench.
  ◦ Adapts to various performance-cost tradeoffs and to dynamic LLM pools.
  ◦ Presents a method for effectively learning the correlation between query and LLM features.
• Limitations:
  ◦ Dependence on the RouterBench dataset; generalization needs to be verified on other datasets.
  ◦ The structural complexity and computational cost of RadialFormer require further analysis.
  ◦ Applicability and scalability in real commercial environments require further study.