Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Reference-Aligned Retrieval-Augmented Question Answering over Heterogeneous Proprietary Documents

Created by
  • Haebom

Authors

Nayoung Choi, Grace Byun, Andrew Chung, Ellie S. Paek, Shinsun Lee, Jinho D. Choi

Outline

This paper proposes a Retrieval-Augmented Generation (RAG)-based question-answering (QA) system to address the challenges of information access due to the vast volume and unstructured nature of internal corporate documents. Using crash test documents from the automotive industry as an example, we focus on processing diverse data types, maintaining data confidentiality, and ensuring traceability between generated answers and the original documents. The proposed system consists of a data pipeline that transforms various document types into a structured corpus and QA pairs, an on-premises privacy-preserving architecture, and a lightweight reference matcher that links answers to supporting content. Application to the automotive industry demonstrates improvements in factual accuracy, informativeness, and usability compared to existing systems.
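The paper's lightweight reference matcher links each part of a generated answer back to the source passages that support it. A minimal sketch of that idea, using bag-of-words cosine similarity between answer sentences and retrieved chunks (the paper's actual matching method, function names, and threshold are assumptions here, not taken from the paper):

```python
import math
import re
from collections import Counter

def _tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def _cosine(a, b):
    """Cosine similarity between two token lists as bag-of-words vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def match_references(answer_sentences, source_chunks, threshold=0.2):
    """For each answer sentence, return the index of the best-matching
    source chunk, or None when no chunk clears the similarity threshold."""
    links = []
    for sent in answer_sentences:
        scores = [_cosine(_tokens(sent), _tokens(c)) for c in source_chunks]
        best = max(range(len(scores)), key=scores.__getitem__)
        links.append(best if scores[best] >= threshold else None)
    return links
```

A matcher like this runs entirely on-premises with no external calls, which is consistent with the privacy-preserving architecture the paper describes; a production system would more likely use dense embeddings than raw token overlap.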

Takeaways, Limitations

Takeaways:
Demonstrates that a RAG-based QA system can address retrieval and information-access problems over internal corporate documents.
Presents a method for effectively processing heterogeneous (multimodal) data types.
Proposes a way to build a QA system while preserving the confidentiality of internal corporate data.
Improves reliability by ensuring traceability between generated answers and source documents.
Highly applicable beyond the automotive industry to other domains.
Limitations:
Performance evaluation is limited to a single industry (automotive) and a limited dataset; further research is needed to establish generalizability to other industries and datasets.
No analysis of the costs and resources required to build and operate the system.
The objectivity and reliability of evaluations using LLM judges require further review.
Scalability to large datasets, and the potential performance degradation that comes with them, are not addressed.