Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Evaluating Retrieval-Augmented Generation vs. Long-Context Input for Clinical Reasoning over EHRs

Created by
  • Haebom

Authors

Skatje Myers, Dmitriy Dligach, Timothy A. Miller, Samantha Barr, Yanjun Gao, Matthew Churpek, Anoop Mayampurath, Majid Afshar

Outline

This paper studies how large language models (LLMs) combined with retrieval-augmented generation (RAG) can handle the long, noisy, and redundant text found in electronic health records (EHRs). To work within the limited context window of existing LLMs, the authors use RAG to retrieve task-relevant passages from the entire EHR and apply it to three clinical tasks: imaging procedure extraction, antibiotic schedule generation, and major diagnosis identification. Using real-world inpatient EHR data, they evaluate three state-of-the-art LLMs with varying amounts of context. RAG performs similarly to or better than using only the most recent records, and it achieves performance comparable to the full context with significantly fewer input tokens. This suggests that RAG remains a competitive and efficient approach even as new models capable of handling longer texts emerge.
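As a rough illustration of the comparison described above, the sketch below contrasts a "most recent records" baseline with retrieval of task-relevant chunks under the same input budget. It is a minimal sketch, not the authors' pipeline: the bag-of-words cosine scoring stands in for whatever retriever the paper uses, words are counted as a proxy for tokens, and the function names, chunk size, budget, query, and toy notes are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_note(note, max_words=50):
    """Split a note into consecutive chunks of at most max_words words."""
    words = note.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(notes, query, word_budget=100, max_words=50):
    """Return the highest-scoring chunks whose total word count fits the budget."""
    q = Counter(tokenize(query))
    chunks = [c for note in notes for c in chunk_note(note, max_words)]
    ranked = sorted(chunks, key=lambda c: cosine(Counter(tokenize(c)), q), reverse=True)
    selected, used = [], 0
    for chunk in ranked:
        n = len(chunk.split())
        if used + n > word_budget:
            continue  # skip chunks that would overflow the budget
        selected.append(chunk)
        used += n
    return selected


def recent_only(notes, word_budget=100):
    """Baseline: keep only the most recent words of the record, up to the budget."""
    words = " ".join(notes).split()
    return " ".join(words[-word_budget:])


# Synthetic toy record (invented for illustration, not from the paper's data).
NOTES = [
    "Day 1: Patient admitted with fever and cough. Chest x-ray ordered and shows right lower lobe infiltrate.",
    "Day 2: Started ceftriaxone and azithromycin for community acquired pneumonia.",
    "Day 3: Diet advanced. Physical therapy consulted. Family meeting held.",
]
QUERY = "imaging procedures x-ray CT MRI ultrasound"  # hypothetical task query

retrieved = retrieve(NOTES, QUERY, word_budget=20)
recent = recent_only(NOTES, word_budget=10)
print("retrieved:", retrieved)
print("recent only:", recent)
```

In this toy record the imaging mention sits in the oldest note, so the recency baseline drops it while retrieval keeps it within a small budget. The text selected by either strategy would then be placed in the LLM prompt for the task.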

Takeaways, Limitations

Takeaways:
The paper demonstrates that RAG can effectively address the long-text problem in EHRs.
It introduces three clinical tasks that can be applied across multiple healthcare systems with minimal effort.
It works around the context window limitations of LLMs and shows that efficient information extraction and reasoning are possible.
It suggests that RAG can remain competitive even with future, more advanced LLMs.
Limitations:
The EHR data used may not generalize beyond the specific healthcare systems studied.
Additional research is needed across a wider range of diseases and patient characteristics.
Further analysis is needed of the RAG retrieval strategy and of how performance changes with the choice of LLM.
Additional validation and safety evaluation are needed for actual clinical application.