Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

Metaphor identification using large language models: A comparison of RAG, prompt engineering, and fine-tuning

Created by
  • Haebom

Author

Matteo Fuoli, Weihang Huang, Jeannette Littlemore, Sarah Turner, Ellen Wilding

Outline

This study explores the potential of large language models (LLMs) for automatically identifying metaphors in discourse. The authors compare three methods: retrieval-augmented generation (RAG), prompt engineering, and fine-tuning. The results show that state-of-the-art closed-source LLMs can achieve high accuracy, with fine-tuning yielding a median F1 score of 0.79. A comparison of human and LLM annotations reveals that most discrepancies are systematic, reflecting well-known gray areas and conceptual challenges in metaphor theory.
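For readers unfamiliar with the metric, a minimal sketch of how an F1 score like the paper's median 0.79 is computed for a binary metaphorical/non-metaphorical labeling task. The gold and predicted labels below are invented for illustration, not taken from the paper's data.

```python
# Hedged sketch: F1 for the positive class (1 = metaphorical) on a
# binary labeling task. Labels are hypothetical, for illustration only.

def f1_score(gold, pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

gold = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical human annotations
pred = [1, 0, 1, 0, 0, 1, 1, 0]  # hypothetical LLM outputs
print(f1_score(gold, pred))  # 0.75
```

Reporting the median F1, as the study does, summarizes performance across repeated runs or folds while being robust to outliers.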

Takeaways, Limitations

Metaphor identification can be partially automated using LLMs.
LLMs can serve as a test bed for developing and refining metaphor identification protocols and the underlying theory.
Fine-tuning showed the best performance.
Differences between LLM and human outcomes reflect the complexity of metaphor theory.
The findings may be limited to the specific LLMs and datasets used.
Further research is needed into the gray areas of metaphor identification.