Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark

Created by
  • Haebom

Authors

Kangwei Liu, Siyuan Cheng, Bozhong Tian, Xiaozhuan Liang, Yuyang Yin, Meng Han, Ningyu Zhang, Bryan Hooi, Xi Chen, Shumin Deng

Outline

This paper presents a comprehensive, expert-annotated benchmark for harmful content detection in Chinese. Existing harmful content detection resources focus on English, while Chinese datasets are scarce and limited in scope, so the authors build a benchmark covering six representative categories drawn from real-world data. The annotation process also yields an expert knowledge rule base that supports LLMs in detecting harmful Chinese content. The authors then propose a knowledge-augmented baseline that combines the human-annotated knowledge rules with the LLMs' implicit knowledge, enabling a small model to achieve performance comparable to state-of-the-art LLMs. Code and data are available at https://github.com/zjunlp/ChineseHarm-bench .
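As a rough illustration of what combining human-annotated rules with an LLM could look like, the sketch below places a small rule base in front of the text to be classified. It is not the authors' implementation: the `KnowledgeRule` type, the `build_augmented_prompt` function, the prompt wording, and the placeholder category names and rule texts are all assumptions made for this example. See the linked repository for the actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeRule:
    """One human-written rule (hypothetical structure, not from the paper)."""
    category: str
    description: str


def build_augmented_prompt(text: str, rules: list[KnowledgeRule]) -> str:
    """Prepend the rule base to the input so a model sees the rules before labeling."""
    # Collect category names in first-seen order, without duplicates.
    categories = list(dict.fromkeys(rule.category for rule in rules))

    lines = ["Classify the text using the rules below.", "Rules:"]
    for index, rule in enumerate(rules, start=1):
        lines.append(f"{index}. [{rule.category}] {rule.description}")
    lines.append("Allowed labels: " + ", ".join(categories))
    lines.append("Text: " + text)
    lines.append("Label:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder rules; the benchmark's real categories and rules are not reproduced here.
    demo_rules = [
        KnowledgeRule("category_a", "Placeholder description of rule A."),
        KnowledgeRule("category_b", "Placeholder description of rule B."),
    ]
    print(build_augmented_prompt("example input", demo_rules))
```

The resulting string would be passed to whichever model is being evaluated. How the rules are selected, worded, and used during training is specific to the paper and is not covered by this sketch.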

Takeaways, Limitations

Takeaways:
Helps address the shortage of data for Chinese harmful content detection.
Accelerates research by providing a large-scale benchmark based on real-world data.
Improves LLM performance by providing an expert knowledge rule base and a knowledge-augmented baseline model.
Shows that small-scale models can reach competitive performance.
Limitations:
The benchmark covers only six categories.
Although the benchmark is based on real-world data, its potential bias and generalizability need further review.
Further research is needed on how well the proposed knowledge-augmented baseline model generalizes.
Further research is needed on different types of harmful content and different Chinese dialects.