Daily Arxiv

This page collects and organizes papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, please cite the source.

TRepLiNa: Layer-wise CKA+REPINA Alignment Improves Low-Resource Machine Translation in Aya-23 8B

Created by
  • Haebom

Authors

Toshiki Nakai, Ravi Kiran Chikkala, Lena Sophie Oberkircher, Nicholas Jennings, Natalia Skachkova, Tatiana Anikina, Jesujoba Oluwadara Alabi

Outline

This study investigates whether increasing cross-lingual similarity within specific internal layers of a decoder-only multilingual large language model (LLM) can improve translation quality for low-resource languages (LRLs) in India. To address the data constraints of LRLs, the authors propose TRepLiNa, which combines Centered Kernel Alignment (CKA), a similarity measure used here to encourage cross-lingual representation alignment, with REPINA, a regularization method that keeps the fine-tuned model's representations close to those of the pre-trained model. Using the MMLoSo shared-task language pairs (Mundari, Santali, Bhili) with a pivot language, they experiment with zero-shot, few-shot, and fine-tuning settings on the Aya-23 8B model with QLoRA. The results show that aligning mid-level layers with TRepLiNa (CKA+REPINA) is a practical, low-cost way to improve LRL translation, especially in data-scarce settings.
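To make the objective concrete, here is a minimal PyTorch sketch of the two components as described above: linear CKA computed on hidden states from a chosen mid layer, plus a REPINA-style penalty on representation drift. The function names, the MSE form of the drift penalty, and the weights lam_cka and lam_rep are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def linear_cka(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Linear CKA between two (n, d) representation matrices.

    x, y: pooled hidden states at the chosen mid layer for n parallel
    source/target sentences (same n; feature dims may differ).
    """
    x = x - x.mean(dim=0, keepdim=True)      # column-center features
    y = y - y.mean(dim=0, keepdim=True)
    cross = torch.linalg.norm(y.T @ x) ** 2  # ||Y^T X||_F^2
    self_x = torch.linalg.norm(x.T @ x)      # ||X^T X||_F
    self_y = torch.linalg.norm(y.T @ y)      # ||Y^T Y||_F
    return cross / (self_x * self_y + 1e-8)

def treplina_loss(ce_loss, h_src, h_tgt, h_ft, h_pre,
                  lam_cka=1.0, lam_rep=0.1):
    """Illustrative combined objective: translation cross-entropy,
    a CKA term pulling source/target mid-layer representations together,
    and a REPINA-style penalty keeping fine-tuned representations (h_ft)
    close to the frozen pre-trained model's representations (h_pre)."""
    align = 1.0 - linear_cka(h_src, h_tgt)   # maximize cross-lingual CKA
    drift = F.mse_loss(h_ft, h_pre)          # penalize representation drift
    return ce_loss + lam_cka * align + lam_rep * drift
```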

Takeaways, Limitations

Takeaways:
Intermediate layer alignment using TRepLiNa (CKA+REPINA) is an effective way to improve the quality of low-resource language translation.
This is especially useful in data-poor environments.
It can be implemented at low cost, since fine-tuning uses QLoRA (see the sketch after this list).
Limitations:
The summary does not identify specific limitations discussed in the paper; the paper itself needs to be reviewed to determine them.
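As a rough illustration of the low-cost point, the sketch below loads Aya-23 8B in 4-bit and attaches LoRA adapters via the Hugging Face transformers and peft libraries, which is the standard QLoRA recipe. The checkpoint id, rank, and target modules are plausible defaults assumed for illustration, not the paper's reported configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization per the QLoRA recipe (hyperparameters illustrative).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "CohereForAI/aya-23-8B",            # Aya-23 8B checkpoint on the Hugging Face Hub
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("CohereForAI/aya-23-8B")

# Low-rank adapters on the attention projections; only these small
# matrices are trained, while the 4-bit base model stays frozen.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()      # typically well under 1% of all weights
```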