Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Test-time Corpus Feedback: From Retrieval to RAG

Created by
  • Haebom

Authors

Mandeep Rathee, V Venktesh, Sean MacAvaney, Avishek Anand

Outline

This paper discusses Retrieval-Augmented Generation (RAG), a standard framework for knowledge-intensive natural language processing tasks that combines large language models (LLMs) with document retrieval from external corpora. Most RAG pipelines treat retrieval and reasoning as independent components: documents are retrieved once, and answers are then generated without further interaction. This static design limits performance on complex tasks that require iterative evidence gathering or high-precision retrieval. Recent work in information retrieval (IR) and NLP has begun to close this gap with adaptive retrieval and ranking methods that incorporate feedback. The paper gives a structured overview of these feedback-driven retrieval and ranking mechanisms and classifies feedback signals by their source and by their role in improving the query, the retrieved context, or the document pool. The authors aim to bridge the IR and NLP perspectives and to emphasize retrieval as a dynamic, learnable component of an end-to-end RAG system.
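The contrast between one-shot retrieval and retrieval that uses corpus feedback can be illustrated with a minimal sketch. This is not a method from the paper: it is a toy form of pseudo-relevance feedback (query expansion from the top-ranked documents), a classic IR technique chosen here because it is one simple way feedback can improve the query. The corpus, the term-overlap scorer, and every function name below are hypothetical and exist only for illustration.

```python
from collections import Counter

# Toy corpus invented for illustration; not data from the paper.
CORPUS = {
    "d1": "rag uses retrieval",
    "d2": "retrieval feedback refines the query",
    "d3": "feedback signals rerank the document pool",
    "d4": "pasta needs boiling water",
}
STOPWORDS = {"the"}


def score(query_terms, doc_text):
    """Weighted term overlap: sum of query weight times term count in the doc."""
    doc_terms = Counter(doc_text.split())
    return sum(weight * doc_terms[term] for term, weight in query_terms.items())


def retrieve(query_terms, k=2):
    """Return up to k document ids with a positive score, best first."""
    scored = [(score(query_terms, text), doc_id) for doc_id, text in CORPUS.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by id
    return [doc_id for s, doc_id in scored[:k] if s > 0]


def expand(query_terms, doc_ids, n_terms=2):
    """Feedback step: add frequent unseen terms from retrieved docs at half weight."""
    pool = Counter()
    for doc_id in doc_ids:
        pool.update(
            t for t in CORPUS[doc_id].split()
            if t not in query_terms and t not in STOPWORDS
        )
    expanded = dict(query_terms)
    best = sorted(pool.items(), key=lambda kv: (-kv[1], kv[0]))[:n_terms]
    for term, _ in best:
        expanded[term] = 0.5
    return expanded


def static_retrieval(query, k=2):
    """Static design: retrieve once and stop."""
    return retrieve({t: 1.0 for t in query.split()}, k)


def feedback_retrieval(query, rounds=1, k=2):
    """Adaptive design: retrieve, expand the query from the results, retrieve again."""
    terms = {t: 1.0 for t in query.split()}
    docs = retrieve(terms, k)
    for _ in range(rounds):
        terms = expand(terms, docs)
        docs = retrieve(terms, k)
    return docs, terms


if __name__ == "__main__":
    print(static_retrieval("rag"))       # one-shot result
    print(feedback_retrieval("rag")[0])  # result after one feedback round
```

On this toy corpus the one-shot query "rag" matches only d1, while one feedback round adds the terms "retrieval" and "uses" and also surfaces d2. In a real RAG system the feedback signal could instead come from a reranker or from the LLM itself; the loop structure is the only point of the sketch.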

Takeaways, Limitations

Takeaways: This paper highlights the importance of adaptive retrieval and ranking methods for improving the performance of RAG systems. It integrates research from IR and NLP to suggest future directions for RAG systems, and it systematically organizes feedback-based retrieval and ranking techniques, providing a foundation for future research.
Limitations: This paper is a survey of existing research and therefore does not present a new methodology. It may also lack an in-depth analysis of the relative importance or effectiveness of the various feedback signals. Performance evaluation results for actual RAG systems are not included.