Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark

Created by
  • Haebom

Authors

Kangwei Liu, Siyuan Cheng, Bozhong Tian, Xiaozhuan Liang, Yuyang Yin, Meng Han, Ningyu Zhang, Bryan Hooi, Xi Chen, Shumin Deng

Outline

This paper presents a comprehensive, expert-annotated benchmark for harmful content detection in Chinese. Existing harmful content detection resources focus mainly on English, while Chinese datasets are scarce and limited in scope. To address this gap, the authors build a benchmark that covers six representative categories and is constructed from real-world data. During annotation, they establish an expert knowledge rule base to support LLMs in detecting harmful Chinese content. They then propose a knowledge-augmented baseline that combines the human-annotated knowledge rules with the implicit knowledge of LLMs, enabling a small model to achieve performance comparable to state-of-the-art LLMs. Code and data are available at https://github.com/zjunlp/ChineseHarm-bench .
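As a rough illustration of how human-written rules might be combined with a model's own knowledge, the sketch below places a small rule base into a classification prompt and parses the model's answer. It is not the authors' implementation: the `Rule` type, the `RULE_BASE` entries, the placeholder category names, and the `build_prompt` and `parse_label` helpers are all hypothetical, and the step that sends the prompt to a model is left out.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Rule:
    """One human-annotated rule (hypothetical structure)."""
    category: str
    description: str


# Placeholder rules; the real rule base and category names come from the paper's data.
RULE_BASE = [
    Rule("category_a", "Placeholder description of what counts as category_a."),
    Rule("category_b", "Placeholder description of what counts as category_b."),
]


def build_prompt(text: str, rules: Sequence[Rule]) -> str:
    """Build a classification prompt that lists the rules before the input text."""
    labels = sorted({rule.category for rule in rules})
    lines = [
        "Classify the text into exactly one of these categories:",
        ", ".join(labels),
        "",
        "Rules:",
    ]
    for index, rule in enumerate(rules, start=1):
        lines.append(f"{index}. [{rule.category}] {rule.description}")
    lines += ["", f"Text: {text}", "Answer with the category name only."]
    return "\n".join(lines)


def parse_label(answer: str, labels: Sequence[str]) -> str:
    """Map a raw model answer to a known label, or 'unknown' if it does not match."""
    normalized = answer.strip().lower()
    return normalized if normalized in labels else "unknown"
```

The resulting prompt would be passed to whichever model is being evaluated, and `parse_label` guards against answers that fall outside the expected label set.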

Takeaways, Limitations

Takeaways:
Helps address the shortage of data for Chinese harmful content detection.
Provides a large-scale, cross-category benchmark built from real-world data.
Suggests that an expert knowledge rule base can improve LLM performance.
Suggests that knowledge augmentation can improve the performance of small models.
The released code and data are expected to stimulate follow-up research.
Limitations:
The benchmark covers only six categories.
It may not fully reflect the diversity of harmful content in the real world.
Further research is needed on how well the proposed knowledge augmentation technique generalizes.
Because the study is limited to a single language (Chinese), its results may not generalize to other languages.