
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Metaphor and Large Language Models: When Surface Features Matter More than Deep Understanding

Created by
  • Haebom

Authors

Elisa Sanchez-Bayona, Rodrigo Agerri

Outline

This paper presents a comprehensive evaluation of how well large language models (LLMs) interpret metaphor across multiple datasets, tasks, and prompt configurations. Previous studies were limited to single-dataset evaluations and specific task settings, and often relied on artificial data built through lexical replacement. This study instead runs extensive experiments on natural language inference (NLI) and question answering (QA) tasks, using diverse publicly available datasets annotated for both inference and metaphor. The results indicate that LLM performance is influenced more by surface features such as lexical overlap and sentence length than by metaphorical content. This suggests that any alleged emergent ability of LLMs to understand metaphorical language results from a combination of surface-level features, in-context learning, and linguistic knowledge. The study highlights the need for more realistic evaluation frameworks for metaphor interpretation tasks and offers insight into the capabilities and limitations of LLMs in processing figurative language. The data and code are publicly available.
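To make the two surface features named above concrete, here is a minimal Python sketch that computes lexical overlap and sentence length for a single NLI premise/hypothesis pair. It is an illustration only, not the authors' code: the function names, the regex tokenizer, the overlap definition (share of distinct hypothesis tokens that also appear in the premise), and the example sentences are all assumptions made for this sketch.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into runs of letters, digits, or apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def lexical_overlap(premise: str, hypothesis: str) -> float:
    """Fraction of distinct hypothesis tokens that also occur in the premise.

    This is one simple, assumed definition of overlap; the paper may use another.
    """
    premise_tokens = set(tokenize(premise))
    hypothesis_tokens = set(tokenize(hypothesis))
    if not hypothesis_tokens:
        return 0.0
    return len(premise_tokens & hypothesis_tokens) / len(hypothesis_tokens)


def surface_features(premise: str, hypothesis: str) -> dict:
    """Collect the surface features for one premise/hypothesis pair."""
    return {
        "lexical_overlap": lexical_overlap(premise, hypothesis),
        "premise_length": len(tokenize(premise)),
        "hypothesis_length": len(tokenize(hypothesis)),
    }


if __name__ == "__main__":
    # Hypothetical example pair: the hypothesis paraphrases the metaphor "a rock".
    premise = "The old man was a rock during the crisis"
    hypothesis = "The old man was dependable during the crisis"
    # 6 of the 7 distinct hypothesis tokens also appear in the premise.
    print(surface_features(premise, hypothesis))
```

Features like these could then be compared against model accuracy to check whether correct predictions track overlap and length rather than the presence of a metaphor.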

Takeaways, Limitations

Takeaways: Provides a broad evaluation of LLMs' metaphor interpretation across datasets, tasks, and prompt settings. Shows that LLMs' apparent metaphor comprehension tends to rely on surface features. Points to the need for a more realistic evaluation framework for metaphor interpretation. Makes the data and code publicly available.
Limitations: Although the evaluation datasets are diverse, they may not cover every aspect of real-world metaphorical language use. A finer-grained analysis of LLMs' ability to understand metaphors is still needed. LLMs' performance biases toward specific types of metaphor or vocabulary should be investigated in more detail.