Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Kuwain 1.5B: An Arabic SLM via Language Injection

Created by
  • Haebom

Authors

Khalil Hennara, Sara Chrouf, Mohamed Motaism Hamed, Zeina Aldallal, Omar Hadid, Safwan AlModhayan

Outline

This paper presents a novel method for efficiently incorporating a new language into an existing large language model (LLM). We built Kuwain, a 1.5-billion-parameter model, by injecting Arabic into a small, open-source, English-trained base model. The approach improved Arabic performance by an average of 8% while preserving the base model's existing knowledge, offering a cost-effective alternative to training a comprehensive model on both English and Arabic from scratch. This demonstrates the potential for efficient, goal-oriented extension of language models without extensive retraining or resource-intensive processes.
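The summary above does not spell out how the injection works. A common recipe for this kind of language injection is to extend the tokenizer vocabulary with tokens for the new language and append a few fresh transformer layers, freezing every original weight so that existing English knowledge is preserved and only the new parameters are trained. The PyTorch sketch below illustrates that pattern; TinyLM, the vocabulary sizes, the layer counts, and the placement of the new layers are all hypothetical stand-ins, not the Kuwain paper's actual architecture.

# Conceptual sketch of "language injection", NOT the authors' exact recipe:
# freeze a pretrained base model, then add new vocabulary rows and new
# transformer blocks that are the only parameters the optimizer updates.
import torch
import torch.nn as nn

D_MODEL, OLD_VOCAB, NEW_VOCAB = 256, 32000, 8000  # hypothetical sizes

class TinyLM(nn.Module):
    """Stand-in for the pretrained English base model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(OLD_VOCAB, D_MODEL)
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(D_MODEL, nhead=8, batch_first=True)
            for _ in range(6)
        )
        self.lm_head = nn.Linear(D_MODEL, OLD_VOCAB)

class InjectedLM(nn.Module):
    """Base model plus injected capacity for the new language."""
    def __init__(self, base: TinyLM, n_new_layers: int = 2):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze existing knowledge
            p.requires_grad = False
        # Extra embedding rows and output logits for the new-language tokens.
        self.new_embed = nn.Embedding(NEW_VOCAB, D_MODEL)
        self.new_head = nn.Linear(D_MODEL, NEW_VOCAB)
        # Fresh transformer blocks appended after the frozen stack.
        self.new_layers = nn.ModuleList(
            nn.TransformerEncoderLayer(D_MODEL, nhead=8, batch_first=True)
            for _ in range(n_new_layers)
        )

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        # Route each token id to the old or the new embedding table.
        is_new = ids >= OLD_VOCAB
        x = torch.where(
            is_new.unsqueeze(-1),
            self.new_embed((ids - OLD_VOCAB).clamp(min=0)),
            self.base.embed(ids.clamp(max=OLD_VOCAB - 1)),
        )
        for layer in self.base.layers:   # frozen: no parameter updates
            x = layer(x)
        for layer in self.new_layers:    # trainable: receives gradients
            x = layer(x)
        # Concatenate old and new vocabulary logits for next-token prediction.
        return torch.cat([self.base.lm_head(x), self.new_head(x)], dim=-1)

model = InjectedLM(TinyLM())
trainable = [p for p in model.parameters() if p.requires_grad]
opt = torch.optim.AdamW(trainable, lr=1e-4)  # optimizer sees only new params

# Toy forward pass over a mixed-vocabulary token sequence.
ids = torch.randint(0, OLD_VOCAB + NEW_VOCAB, (2, 16))
logits = model(ids)  # shape: (2, 16, OLD_VOCAB + NEW_VOCAB)

Because the optimizer only ever sees the newly added parameters, the frozen base cannot drift, which is what makes this style of extension cheap relative to retraining a bilingual model from scratch.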

Takeaways, Limitations

Takeaways:
A new way to efficiently add new languages to existing LLMs.
Improved target language performance (8% on average) while minimizing loss of existing knowledge.
Shows the possibility of cost-effectively building multilingual LLMs without extensive retraining.
Limitations:
The Kuwain model is relatively small (1.5 billion parameters), so it is uncertain whether the method scales to larger models.
Further research is needed on generalizability across different languages and specific language pairs.
The effectiveness of the proposed method may vary depending on the open-source base model used and the characteristics of the target language.