Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Fluent but Unfeeling: The Emotional Blind Spots of Language Models

Created by
  • Haebom

Authors

Bangzhao Shu, Isha Joshi, Melissa Karnaze, Anh C. Pham, Ishita Kakkar, Sindhu Kothe, Arpine Hovasapian, Mai ElSherief

Outline

This paper examines the emotion recognition capabilities of large language models (LLMs). Unlike previous studies that sort emotions into a limited set of categories, the authors present EXPRESS, a new benchmark dataset of 251 fine-grained, self-reported emotion labels collected from Reddit communities. They systematically evaluate several LLMs under various prompt settings and show that the models struggle to predict emotions consistent with human self-reports. Qualitative analysis reveals that while some LLMs generate emotion terms aligned with established emotion theories and definitions, they fail to capture contextual cues as effectively as human self-reports do. The study thus highlights the limitations of LLMs in fine-grained emotion alignment and offers insights for future research on improving contextual understanding.

Takeaways, Limitations

Takeaways:
Presenting a new benchmark dataset (EXPRESS) for fine-grained emotion recognition.
A systematic evaluation of LLMs' ability to predict fine-grained emotions, along with its limitations.
Suggesting research directions to improve LLMs' contextual understanding.
Limitations:
Limited generalizability of datasets based on Reddit data.
The potential for reduced accuracy due to the subjectivity of self-reported emotions.
Difficulties in generalizing due to limitations in the types and versions of LLMs used in the evaluation.