Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Multilingual Performance Biases of Large Language Models in Education

Created by
  • Haebom

Author

Vansh Gupta, Sankalan Pal Chowdhury, Vilém Zouhar, Donya Rooein, Mrinmaya Sachan

Outline

This paper evaluates how well large language models (LLMs) serve educational settings across multiple languages (English, Mandarin, Hindi, Arabic, German, Persian, Telugu, Ukrainian, and Czech). LLM performance was measured on four educational tasks: identifying student misconceptions, providing personalized feedback, interactive tutoring, and grading translations. The results show that performance correlates primarily with how much of each language appears in the training data. Performance was particularly poor for low-resource languages, where degradation relative to English occurred more frequently.

Takeaways, Limitations

Takeaways: By presenting an empirical evaluation of LLMs' educational applicability across many languages, including low-resource ones, the paper underscores the importance of validating an LLM's performance in the target language before deploying it in educational settings. It also shows that LLM performance is strongly shaped by the linguistic composition of the training data.
Limitations: The study covers a specific set of languages and tasks, limiting generalizability to others. It also lacks an in-depth analysis of why LLM performance degrades, and it does not include a comparative analysis across multiple LLMs.