Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Iterative Utility Judgment Framework via LLMs Inspired by Relevance in Philosophy

Created by
  • Haebom

Authors

Hengran Zhang, Keping Bi, Jiafeng Guo, Xueqi Cheng

Outline

This paper focuses on relevance and utility, two criteria for evaluating the effectiveness of information retrieval systems, and argues that Retrieval-Augmented Generation (RAG) with large language models (LLMs), whose input length is limited, should prioritize highly useful results. The authors connect three core RAG components—relevance ranking from retrieval models, utility judgments, and answer generation—to Schutz's philosophical framework of relevance, showing that these components mirror three kinds of relevance that reinforce one another. Building on this framework, they propose an iterative utility judgment framework (ITEM) that improves each stage of RAG. Experiments on the TREC DL, WebAP, GTI-NQ, and NQ datasets show that ITEM significantly improves utility judgments, ranking, and answer generation compared to baseline models.
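The iterative loop described above—judge utility, re-rank, generate an answer, then refine the judgment with that answer—can be sketched as follows. This is a minimal illustration, not the paper's implementation: the "LLM" calls are stand-in keyword heuristics, and all function names (`judge_utility`, `generate_answer`, `item_loop`) are hypothetical.

```python
def judge_utility(question, passages, answer_draft=""):
    """Stand-in utility judgment: score each passage by word overlap with
    the question plus the current answer draft. A real system would prompt
    an LLM for these judgments."""
    context = set((question + " " + answer_draft).lower().split())
    return {p: len(context & set(p.lower().split())) for p in passages}

def generate_answer(question, top_passages):
    """Stand-in answer generation: concatenate the top-ranked passages.
    A real system would generate with an LLM conditioned on them."""
    return " ".join(top_passages)

def item_loop(question, retrieved, k=2, iterations=2):
    """Iterate: utility judgment -> re-rank -> answer -> refined judgment.
    Each pass feeds the previous answer back into the utility judgment,
    mirroring the mutual enhancement among the three RAG components."""
    answer = ""
    ranked = list(retrieved)
    for _ in range(iterations):
        scores = judge_utility(question, ranked, answer)
        ranked = sorted(ranked, key=scores.get, reverse=True)
        answer = generate_answer(question, ranked[:k])
    return ranked, answer

ranked, answer = item_loop(
    "what causes tides on earth",
    ["The moon's gravity causes tides on earth",
     "Pizza recipes vary by region",
     "Tides on earth rise twice daily"],
)
```

After two iterations, the passage most related to the question (and to the evolving answer draft) ranks first, while the off-topic passage falls to the bottom.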

Takeaways, Limitations

Takeaways:
  • Presents a new framework (ITEM) for improving the performance of RAG systems.
  • Proposes an effective approach to information retrieval and answer generation that considers both relevance and utility.
  • Analyzes and improves RAG components using Schutz's philosophical framework of relevance.
  • Verifies the effectiveness of ITEM through experiments on multiple datasets.
Limitations:
  • Further research is needed on the generalizability of the proposed ITEM framework.
  • Additional experiments are needed on other question types and datasets.
  • The subjectivity of utility judgments may not be fully eliminated.
  • The framework may be tuned toward specific LLMs; how performance changes with other LLMs requires analysis.