Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized by Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please be sure to credit the source when sharing.

Evaluating Multilingual and Code-Switched Alignment in LLMs via Synthetic Natural Language Inference

Created by
  • Haebom

Author

Samir Abdaljalil, Erchin Serpedin, Khalid Qaraqe, Hasan Kurban

Outline

This paper presents a controlled evaluation framework for assessing whether large language models (LLMs) reason consistently and logically across multilingual settings. We generate synthetic, logic-based premise-hypothesis pairs, translate them into a morphologically diverse set of languages, and test the models under both monolingual and mixed-language (code-switched) conditions. Surprisingly, code-switching can improve performance rather than degrade it, suggesting that translation-induced lexical variation can serve as a regularization signal. We verify the fidelity of the translated pairs using embedding-based similarity analysis and cross-lingual alignment visualization. In conclusion, we demonstrate both the potential and the vulnerabilities of current cross-lingual inference in LLMs and present code-switching as a promising approach for improving multilingual robustness.
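To make the setup concrete, below is a minimal sketch of one translated pair, a code-switched condition, and the embedding-based fidelity check, assuming the sentence-transformers library with a multilingual encoder. The model name, the example pair, the language choice, and the similarity threshold are illustrative assumptions, not the authors' exact configuration.

```python
from sentence_transformers import SentenceTransformer, util

# Hypothetical synthetic, logic-based NLI pair (English source).
premise_en = "All researchers attended the workshop."
hypothesis_en = "Some researchers attended the workshop."
label = "entailment"

# Hypothetical translation of the same pair (Spanish, for illustration).
premise_es = "Todos los investigadores asistieron al taller."
hypothesis_es = "Algunos investigadores asistieron al taller."

# One simple code-switched condition: premise in one language,
# hypothesis in another.
code_switched_pair = (premise_es, hypothesis_en)

# Embedding-based fidelity check: encode source and translation with a
# multilingual encoder and compare them by cosine similarity.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
emb = model.encode([premise_en, premise_es, hypothesis_en, hypothesis_es])

premise_sim = util.cos_sim(emb[0], emb[1]).item()
hypothesis_sim = util.cos_sim(emb[2], emb[3]).item()

# Retain only pairs whose translations stay close to the source in
# embedding space (the threshold is an assumption).
FIDELITY_THRESHOLD = 0.85
if premise_sim >= FIDELITY_THRESHOLD and hypothesis_sim >= FIDELITY_THRESHOLD:
    print(f"Pair retained (sims: {premise_sim:.2f}, {hypothesis_sim:.2f})")
else:
    print("Pair discarded: translation drifted from the source meaning")
```

In a full pipeline, a filter like this would run over every translated pair before it enters the monolingual or code-switched test sets, so that any performance differences reflect the language condition rather than translation noise.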

Takeaways, Limitations

Takeaways:
Presents a controlled framework for multilingual NLI evaluation.
Demonstrates that code-switching can improve the multilingual reasoning performance of LLMs.
Suggests that translation-induced lexical variation can act as a regularization signal for the model.
Demonstrates both the potential and the vulnerability of LLMs' cross-lingual reasoning abilities.
Limitations:
The evaluation relies on synthetic data, so generalizability to real-world data still needs to be verified.
Further research is needed to determine whether the results generalize beyond the specific language sets and LLMs evaluated.
A deeper analysis of the effects of code-switching, and of the mechanisms behind them, is still needed.