Daily Arxiv

This page curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

AI-Powered Detection of Inappropriate Language in Medical School Curricula

Created by
  • Haebom

Authors

Chiman Salavati, Shannon Song, Scott A. Hale, Roberto E. Montenegro, Shiri Dori-Hacohen, Fabricio Murai

Outline

This paper evaluates small language models (SLMs) and pre-trained large language models (LLMs) for automatically identifying inappropriate use of language (IUL) in medical education materials. Using a dataset of approximately 500 documents (over 12,000 pages), the authors compare four SLM configurations: a general IUL classifier, subcategory-specific binary classifiers, a multi-label classifier, and a hierarchical pipeline. These are evaluated against LLMs (Llama-3 8B and 70B) run with several prompt variations. The SLMs significantly outperformed the LLMs, even when the LLM prompts included carefully constructed shots (in-context examples). In particular, the subcategory-specific binary classifiers, trained with negative examples drawn from sections free of inappropriate language, performed best.
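To make the best-performing setup concrete, the following is a minimal sketch of how training data for one subcategory-specific binary classifier might be assembled, with unflagged excerpts added as extra negatives. It is not the authors' code: the `Excerpt` type, the `build_binary_dataset` helper, and the subcategory names are hypothetical, since the summary does not list the actual subcategories or describe the data format.

```python
from dataclasses import dataclass, field

# Hypothetical subcategory names, for illustration only.
SUBCATEGORIES = ["outdated_term", "exclusionary_term"]


@dataclass
class Excerpt:
    text: str
    # Subcategories flagged by annotators; an empty set means "unflagged".
    labels: set = field(default_factory=set)


def build_binary_dataset(excerpts, subcategory, include_unflagged=True):
    """Return (text, target) pairs for one subcategory-specific classifier.

    target is 1 when the excerpt carries `subcategory`, otherwise 0.
    Unflagged excerpts are kept as additional negatives only when
    include_unflagged is True.
    """
    dataset = []
    for ex in excerpts:
        if subcategory in ex.labels:
            dataset.append((ex.text, 1))
        elif ex.labels or include_unflagged:
            dataset.append((ex.text, 0))
    return dataset


# Toy data: two flagged excerpts and one unflagged excerpt.
excerpts = [
    Excerpt("excerpt A", {"outdated_term"}),
    Excerpt("excerpt B", {"exclusionary_term"}),
    Excerpt("excerpt C"),
]

# One training set per subcategory, each feeding its own binary classifier.
datasets = {name: build_binary_dataset(excerpts, name) for name in SUBCATEGORIES}
```

Each resulting list would then be used to train a separate binary model. Setting `include_unflagged=False` gives the variant trained on annotated excerpts only, which allows the two settings to be compared.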

Takeaways, Limitations

Takeaways:
SLMs are more effective than LLMs at automatically identifying inappropriate language use in medical education materials.
In particular, subcategory-specific binary classifiers trained with negative examples that contain no inappropriate language use showed high performance.
The results suggest that an SLM-based automated system could help improve the quality of medical education materials and reduce bias in them.
Limitations:
The size of the dataset used in the study may be relatively small.
The study may not cover all types of inappropriate language use.
Further research and validation are needed for application to actual medical education settings.
The causes of the LLMs' weaker performance are not analyzed in depth.