Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

DIVER: A Multi-Stage Approach for Reasoning-intensive Information Retrieval

Created by
  • Haebom

Authors

Meixiu Long, Duolin Sun, Dan Yang, Junjie Wang, Yue Shen, Jian Wang, Peng Wei, Jinjie Gu, Jiahai Wang

Outline

This paper addresses a limitation of retrieval-augmented generation models: they achieve robust performance on knowledge-intensive tasks where query-document relevance can be identified through direct lexical or semantic matching, but existing retrieval systems struggle with the many real-world queries that involve abstract reasoning, analogical thinking, or multi-step inference. To address this challenge, the authors present DIVER, a retrieval pipeline designed for reasoning-intensive information retrieval. DIVER consists of four components: document preprocessing to improve input quality, LLM-based query expansion via iterative interaction with documents, a reasoning-enhanced retriever fine-tuned on synthetic multi-domain data with hard negatives, and a pointwise reranker that combines LLM-assigned usefulness scores with retrieval scores. On the BRIGHT benchmark, DIVER consistently outperforms competing reasoning-aware models, achieving state-of-the-art nDCG@10 scores of 41.6 and 28.9 on the original queries. These results demonstrate the effectiveness of reasoning-aware retrieval strategies on complex, real-world tasks. The code and retrieval models are to be released publicly.
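The final pipeline stage above fuses two signals per document: a retrieval score and an LLM-assigned usefulness score. The summary does not specify DIVER's exact fusion rule, so the sketch below uses a hypothetical min-max normalization followed by linear interpolation with a weight `alpha`; the `Candidate` class, `rerank` function, and `alpha` parameter are illustrative, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    doc_id: str
    retrieval_score: float  # score from the (reasoning-enhanced) retriever
    llm_usefulness: float   # pointwise usefulness assigned by an LLM judge

def min_max(values):
    """Scale a list of scores into [0, 1] so the two signals are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def rerank(candidates, alpha=0.5):
    """Rank candidates by a weighted sum of normalized scores.

    alpha is a hypothetical interpolation weight between the LLM usefulness
    signal (alpha) and the retrieval signal (1 - alpha).
    """
    r = min_max([c.retrieval_score for c in candidates])
    u = min_max([c.llm_usefulness for c in candidates])
    fused = [alpha * ui + (1 - alpha) * ri for ui, ri in zip(u, r)]
    # Sort by fused score, highest first.
    return [c for _, c in sorted(zip(fused, candidates), key=lambda p: -p[0])]

cands = [
    Candidate("A", retrieval_score=10.0, llm_usefulness=0.9),
    Candidate("B", retrieval_score=12.0, llm_usefulness=0.2),
    Candidate("C", retrieval_score=8.0, llm_usefulness=0.8),
]
ranked = rerank(cands, alpha=0.5)
print([c.doc_id for c in ranked])  # "A" ranks first: strong on both signals
```

With alpha = 0.5 the two signals count equally; raising alpha lets the LLM judgment override raw retrieval rank, which is the intent of a reasoning-aware reranking stage.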

Takeaways, Limitations

Takeaways: DIVER is an effective new retrieval pipeline for reasoning-intensive information retrieval, achieving state-of-the-art performance on the BRIGHT benchmark. The results demonstrate the effectiveness of LLM-based query expansion and reasoning-enhanced retrievers, and highlight the importance of reasoning-aware retrieval strategies for complex real-world tasks.
Limitations: The code and retrieval models are not yet publicly available. Performance has only been verified on the BRIGHT benchmark, and generalization across diverse types of reasoning tasks remains to be evaluated.