Daily Arxiv

This page organizes papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-$k$

Created by
  • Haebom

Authors

Chihiro Taguchi, Seiji Maekawa, Nikita Bhutani

Outline

Retrieval-augmented generation (RAG) and long-context language models (LCLMs) address the context-length limitations of LLMs, but deciding how much external context to retrieve remains an open problem. Adaptive-$k$ retrieval is a simple and effective single-pass method that adaptively selects the number of retrieved passages based on the distribution of similarity scores between the query and the candidate passages. It requires no model fine-tuning, no additional LLM inference, and no changes to the existing retriever-reader pipeline.
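The summary describes the selection rule only at a high level, so the Python sketch below is a minimal illustration rather than the authors' exact algorithm: it assumes the cutoff is placed at the largest gap in the sorted similarity scores, and the function name `adaptive_k_select` and the `max_k` cap are hypothetical.

```python
import numpy as np

def adaptive_k_select(scores, max_k=None):
    """Pick a query-dependent number of passages from similarity scores.

    Illustrative cutoff rule (an assumption, not the paper's exact one):
    sort scores in descending order and cut at the largest drop between
    consecutive scores, keeping everything above the drop.
    """
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]        # passage indices, best first
    ranked = scores[order]
    if max_k is not None:                   # optional hypothetical cap
        order, ranked = order[:max_k], ranked[:max_k]
    if ranked.size < 2:                     # nothing to compare against
        return order.tolist()
    gaps = ranked[:-1] - ranked[1:]         # score drop after each rank
    k = int(np.argmax(gaps)) + 1            # cut just before the largest drop
    return order[:k].tolist()

# Example: a sharp drop after the second score yields k = 2.
scores = [0.91, 0.88, 0.42, 0.40, 0.12]
print(adaptive_k_select(scores))            # -> [0, 1]
```

Because the cutoff depends only on the score distribution, the same query can pull in two passages or twenty, which is what makes the method single-pass: no iterative prompting is needed to decide when to stop retrieving.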

Takeaways, Limitations

Adaptive-$k$ retrieval performs on par with or better than fixed-$k$ baselines while using up to 10x fewer tokens than full-context input and still retrieving 70% of the relevant passages.
It improves accuracy across five LCLMs and two embedding models.
As a single-pass method, it avoids the iterative LLM prompting that existing adaptive methods rely on.
It is especially effective on aggregation QA, where evidence must be gathered from many passages.