Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

LinguaSafe: A Comprehensive Multilingual Safety Benchmark for Large Language Models

Created by
  • Haebom

Authors

Zhiyuan Ning, Tianle Gu, Jiaxin Song, Shixin Hong, Lingyu Li, Huacan Liu, Jie Li, Yixu Wang, Meng Lingyu, Yan Teng, Yingchun Wang

Outline

This paper focuses on ensuring the safety of large language models (LLMs) across diverse linguistic and cultural contexts. To address the lack of comprehensive evaluations and diverse data in existing multilingual LLM safety benchmarks, the authors present LinguaSafe, a multilingual safety benchmark comprising 45,000 items across 12 languages, from Hungarian to Malay. Built from a combination of translated, adapted, and natively sourced data, LinguaSafe provides a multidimensional, fine-grained evaluation framework that includes direct and indirect safety assessments as well as an additional evaluation of oversensitivity. The results show that safety and helpfulness scores vary significantly across languages and domains, underscoring the importance of multilingual LLM safety evaluation. The dataset and code are openly released to support further research.
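The per-language, per-domain breakdown described above can be sketched as a simple aggregation. The snippet below is a minimal illustration, not LinguaSafe's actual API: the record schema and field names (`language`, `domain`, `verdict`) are hypothetical, standing in for whatever format the released dataset and evaluation code use.

```python
from collections import defaultdict

# Hypothetical evaluation records; the real LinguaSafe schema is not
# shown in the summary, so these field names are illustrative only.
results = [
    {"language": "hu", "domain": "direct", "verdict": "safe"},
    {"language": "hu", "domain": "indirect", "verdict": "unsafe"},
    {"language": "ms", "domain": "direct", "verdict": "safe"},
    {"language": "ms", "domain": "oversensitivity", "verdict": "over_refusal"},
]

def safety_rate_by_language(records):
    """Return the fraction of responses judged safe, grouped by language."""
    totals = defaultdict(int)
    safe = defaultdict(int)
    for r in records:
        totals[r["language"]] += 1
        if r["verdict"] == "safe":
            safe[r["language"]] += 1
    return {lang: safe[lang] / totals[lang] for lang in totals}

print(safety_rate_by_language(results))
# → {'hu': 0.5, 'ms': 0.5}
```

Grouping by `domain` instead of `language` would surface the cross-domain differences the paper reports; both views are needed to see where a model's safety behavior diverges.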

Takeaways, Limitations

Takeaways:
Provides LinguaSafe, a comprehensive benchmark for assessing the safety of multilingual LLMs.
Addresses existing linguistic bias by covering 12 languages, from Hungarian to Malay.
Offers a multidimensional evaluation framework that includes direct and indirect safety assessments plus an oversensitivity check.
Shows that safety evaluation results for multilingual LLMs differ significantly across languages and domains.
The publicly released dataset and code lay the foundation for future multilingual LLM safety research.
Limitations:
The size and language coverage of the LinguaSafe dataset can be further expanded.
Additional validation of the objectivity and reliability of the evaluation framework may be required.
It is possible that biases toward certain languages or cultural contexts still exist.
Adaptability to new LLM architectures and features needs further study.