
Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters

Created by
  • Haebom

Author

Shanbo Cheng, Yu Bao, Qian Cao, Luyang Huang, Liyan Kang, Zhicheng Liu, Yu Lu, Wenhao Zhu, Zhichao Huang, Tao Li, Sitong Liu, Ningxin Peng, Shuaijie She, Lu Xu, Nuo Xu, Sen Yang, Runsheng Yu, Yiming Yu, Liehao Zou, Hang Li, Lu Lu, Yuxuan Wang, Yonghui Wu

Outline

Seed-X is a family of open-source multilingual large language models (LLMs) with 7B parameters. The base model is pre-trained on diverse, high-quality monolingual and bilingual data covering 28 languages; an instruction-tuned model is then fine-tuned with Chain-of-Thought (CoT) reasoning, and reinforcement learning (RL) further improves its generalization across many language pairs. Seed-X achieves translation performance comparable to state-of-the-art closed-source models such as Gemini-2.5 and GPT-4o across the 28 languages, and significantly outperforms larger open-source models in both automatic evaluation metrics and human evaluation. By sharing best practices from the optimization process and releasing the model parameters, the authors aim to advance translation research and applications.
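The CoT-style translation setup described above can be illustrated with a simple prompt template. The wording, marker string, and function below are hypothetical sketches of the general idea, not the actual Seed-X training or inference format, which the summary does not specify:

```python
def build_cot_translation_prompt(source_text: str, src_lang: str, tgt_lang: str) -> str:
    """Build a hypothetical Chain-of-Thought translation prompt.

    The model is asked to reason about idioms, ambiguous terms, and
    register before committing to a final translation, mirroring the
    CoT fine-tuning idea described in the paper (exact format unknown).
    """
    return (
        f"Translate the following {src_lang} text into {tgt_lang}.\n"
        "First, think step by step: identify idioms, ambiguous terms, "
        "and the register of the text. Then give the final translation "
        "after the marker 'Translation:'.\n\n"
        f"Source ({src_lang}): {source_text}\n"
        "Reasoning:"
    )


# Example: constructing a prompt for a Korean-to-English translation.
prompt = build_cot_translation_prompt("축하합니다!", "Korean", "English")
print(prompt)
```

In an RL stage like the one the paper describes, the reasoning span produced before the final translation could be scored and optimized separately from the translation itself; the template above only shows the prompting side of that pipeline.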

Takeaways, Limitations

Takeaways:
Achieves multilingual translation performance comparable to state-of-the-art closed-source models with a relatively small 7B parameter count.
Released as open source, contributing to multilingual translation research and applications.
Presents a performance-improvement strategy combining CoT reasoning and reinforcement learning.
Strong generalization across a wide range of language pairs.
Limitations:
The paper does not discuss specific limitations; there is room for improvement through further research.