Daily Arxiv

This page collects papers related to artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

On the Consistency of Multilingual Context Utilization in Retrieval-Augmented Generation

Created by
  • Haebom

Authors

Jirui Qi, Raquel Fernandez, Arianna Bisazza

Outline

This paper studies how effectively large language models (LLMs) leverage multilingual context in retrieval-augmented generation (RAG) systems. Specifically, it evaluates the LLMs' ability to use relevant passages written in a language different from the query, to respond in the expected language, and to stay focused on relevant passages even when distracting passages in multiple languages are also present. Experiments with four LLMs on three QA datasets spanning 48 languages reveal that while LLMs are good at extracting information from passages in languages other than the query, they struggle to generate complete answers in the correct language. Furthermore, distracting passages degrade answer quality regardless of their language, with distractors in the query language having the greater impact.
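As a concrete illustration of the setup described above, here is a minimal sketch that builds a RAG-style prompt in which the relevant passage is in a different language from the query and "distracting" passages are shuffled in. The function name, prompt wording, and example passages are my own illustrative assumptions, not the authors' evaluation code.

```python
# Minimal sketch (illustrative, not the paper's code) of a multilingual RAG
# prompt: the relevant passage is in a different language from the query,
# and distracting passages are mixed in at random positions.
import random

def build_mrag_prompt(query: str, relevant: str, distractors: list[str],
                      answer_language: str, seed: int = 0) -> str:
    """Shuffle the relevant passage among distractors and instruct the model
    to answer in a fixed target language."""
    passages = distractors + [relevant]
    random.Random(seed).shuffle(passages)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        f"Answer the question in {answer_language}, using only the passages below.\n"
        f"{context}\n"
        f"Question: {query}\nAnswer:"
    )

if __name__ == "__main__":
    prompt = build_mrag_prompt(
        query="When was the Eiffel Tower completed?",             # English query
        relevant="La tour Eiffel a été achevée en 1889.",         # relevant passage in French
        distractors=["The Louvre opened to the public in 1793.",  # distractor in the query language
                     "Der Kölner Dom wurde 1880 vollendet."],     # distractor in German
        answer_language="English",
    )
    print(prompt)
```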

Takeaways, Limitations

  • LLMs are relatively good at extracting relevant information from passages in languages other than the query.
  • LLMs struggle to generate complete answers in the correct language (a simple automatic check is sketched after this list).
  • Distracting passages degrade answer quality regardless of their language.
  • Distracting passages in the query language harm answer quality the most.
  • The study deepens our understanding of how LLMs use context in multilingual RAG (mRAG) systems.
  • It suggests directions for future improvement of RAG systems.
  • The study covers only four LLMs and three QA datasets, which may limit generalizability.
  • Results may depend on the specific models and datasets used.
  • Further research should account for additional contextual factors (e.g., passage length and complexity).
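Following up on the takeaway about answer language, a lightweight way to measure answer-language consistency is to run a language identifier over each generated answer. The sketch below uses the langdetect package (pip install langdetect); it is an illustrative metric under my own assumptions, not the evaluation protocol used in the paper.

```python
# Hedged sketch of an answer-language consistency check using langdetect.
# This is an illustrative metric, not the paper's evaluation code.
from langdetect import detect, DetectorFactory

DetectorFactory.seed = 0  # make langdetect's output deterministic

def answer_in_expected_language(answer: str, expected_code: str) -> bool:
    """Return True if the detected language of `answer` matches the expected
    ISO 639-1 code (e.g., 'en', 'fr')."""
    try:
        return detect(answer) == expected_code
    except Exception:  # langdetect raises on empty or undetectable input
        return False

print(answer_in_expected_language("La tour Eiffel a été achevée en 1889.", "fr"))   # True
print(answer_in_expected_language("The Eiffel Tower was completed in 1889.", "fr")) # False
```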