This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
Fluent but Unfeeling: The Emotional Blind Spots of Language Models
Created by
Haebom
Author
Bangzhao Shu, Isha Joshi, Melissa Karnaze, Anh C. Pham, Ishita Kakkar, Sindhu Kothe, Arpine Hovasapian, Mai ElSherief
Outline
This paper examines the emotion recognition capabilities of large language models (LLMs). Unlike previous studies that restrict emotions to a small set of categories, the authors present EXPRESS, a new benchmark dataset of 251 fine-grained, self-reported emotion labels collected from Reddit communities. They systematically evaluate several LLMs under various prompt settings and show that the models struggle to predict emotions consistent with human self-reports. Qualitative analysis reveals that while some LLMs generate emotion terms aligned with established emotion theories and definitions, they fail to capture contextual cues as effectively as human self-reports do. The study thus highlights the limits of LLMs' fine-grained emotional alignment and offers insights for future work on improving their contextual understanding.
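To make the evaluation protocol concrete, the sketch below illustrates (with purely hypothetical function names and toy data, not the actual EXPRESS release or the paper's prompts) how a predicted emotion label can be scored against an author's self-report by exact match:

```python
# Illustrative sketch only: toy stand-ins for the LLM call and the dataset.
# The real benchmark uses 251 fine-grained labels from Reddit self-reports.

def exact_match_accuracy(examples, predict):
    """Fraction of posts whose predicted label equals the self-reported one."""
    hits = sum(
        1 for post, self_report in examples
        if predict(post).strip().lower() == self_report.strip().lower()
    )
    return hits / len(examples)

def toy_predict(post):
    # Stand-in for an LLM queried under a fixed prompt setting.
    return "anxious" if "worried" in post else "content"

examples = [
    ("I've been so worried about my exams.", "anxious"),
    ("Honestly I feel fine, but everything is piling up.", "overwhelmed"),
]

print(exact_match_accuracy(examples, toy_predict))  # 0.5
```

In practice the paper reports that models often miss cases like the second example, where the contextual cue ("everything is piling up") rather than an explicit emotion word carries the self-reported label.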
Takeaways, Limitations
•
Takeaways:
◦
Presents EXPRESS, a new benchmark dataset for fine-grained emotion recognition.
◦
Systematically evaluates LLMs' ability to predict fine-grained emotions and identifies their limitations.
◦
Suggests research directions for improving LLMs' contextual understanding.
•
Limitations:
◦
Limited generalizability, since the dataset is drawn solely from Reddit.
◦
Potentially reduced accuracy due to the subjectivity of self-reported emotions.
◦
Difficulty generalizing results, given the limited range of LLM types and versions evaluated.