Large Language Models for Information Retrieval: A Survey
yeji Kim
Introduction
retrieval
upstream - query reformulation
downstream - reranking and reading
reranking
only on a limited set of relevant documents
personalization, diversification
Background
Info retrieval
relevance estimation - in the classic vector space model, lexical similarity between the query and document vectors.
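As a rough illustration of the lexical relevance estimation noted above, here is a minimal sketch that scores documents by cosine similarity between raw term-count vectors. The whitespace tokenizer, the function names (`tf_vector`, `cosine`, `rank`), and the lack of IDF weighting are simplifying assumptions for these notes, not something the survey prescribes.

```python
import math
from collections import Counter


def tf_vector(text):
    """Turn text into a sparse term-count vector (naive whitespace tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, doc) pairs sorted from most to least similar."""
    q = tf_vector(query)
    scored = [(cosine(q, tf_vector(d)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

A document that shares no terms with the query scores 0 here, which is exactly the vocabulary-mismatch weakness that query rewriting tries to address.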
Components
Query rewriter
Retriever
Reranker - fine-grained reordering
Reader - comprehend real-time user intent and generate dynamic responses
Search agent
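The first four components above can be read as stages of one pipeline. The sketch below wires them together as plain callables; `search_pipeline` and its parameter names are hypothetical, and passing the original query to the reader while the rewritten one goes to the retriever and reranker is an assumption made here, not a rule from the survey. A search agent would sit above this and decide when to call each stage.

```python
def search_pipeline(query, rewrite, retrieve, rerank, read, k=3):
    """Chain rewriter -> retriever -> reranker -> reader.

    All four stages are caller-supplied functions (hypothetical stand-ins
    for real models).
    """
    rewritten = rewrite(query)                # upstream: query reformulation
    candidates = retrieve(rewritten)          # first-stage recall
    top_docs = rerank(rewritten, candidates)[:k]  # fine-grained reordering
    return read(query, top_docs)              # generate the final response
```

Only the top `k` documents reach the reader, which matches the note that reranking works on a limited set of relevant documents.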
LLMs
Query rewriter
Rewriting scenario
generate synonyms and related concepts, enhancing queries to cover a broader range of relevant documents
Adhoc retrieval
adding related terms and clarifying ambiguous queries
Conversational search
e.g. CONVERSER - generate synthetic passage-dialogue pairs
Rewriting knowledge
LLM-only methods
HyDE
GFF - generate, filter, fuse
Corpus-enhanced LLM-based methods - (note to self: I should probably learn how these methods actually work)
Late fusion of LLM-based re-writing and pseudo relevance feedback (PRF)
QUILL, CAR, LameR, InteR
Combining retrieved relevant documents in the prompts of LLMs
GRM, RASE
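To make the LLM-only idea behind HyDE more concrete: the LLM writes a hypothetical answer passage, and retrieval is done with that passage's embedding instead of the query's. The sketch below is a toy version under loud assumptions: `generate` is a hypothetical callable standing in for an LLM, and `embed` is a bag-of-words stand-in for the dense encoder a real setup would use.

```python
import math
from collections import Counter


def embed(text):
    # ASSUMPTION: bag-of-words counts stand in for a dense encoder.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_retrieve(query, corpus, generate, k=2):
    """Retrieve with the embedding of an LLM-written hypothetical passage."""
    # `generate` is a hypothetical LLM call: prompt string in, text out.
    hypothetical = generate(f"Write a passage that answers: {query}")
    h = embed(hypothetical)
    ranked = sorted(corpus, key=lambda d: cosine(h, embed(d)), reverse=True)
    return ranked[:k]
```

The hypothetical passage may contain factual errors; the point is only that it lands near relevant real documents in embedding space, even when the query itself shares no terms with them.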
Rewriting approaches
Prompting
Zero-shot, few-shot, CoT (chain of thought)
Fine-tuning
Supervised fine-tuning
BEQUE, ChatGLM, ChatGLM2.0, Baichuan, Qwen
Knowledge distillation
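The three prompting styles listed under Rewriting approaches differ mainly in what goes into the prompt. A minimal sketch of each follows; the instruction wording and function names are illustrative assumptions, not templates taken from the survey.

```python
INSTRUCTION = "Rewrite the search query to be clearer and more specific."


def zero_shot_prompt(query):
    """Instruction only, no demonstrations."""
    return f"{INSTRUCTION}\nQuery: {query}\nRewritten query:"


def few_shot_prompt(query, examples):
    """Instruction plus (query, rewrite) demonstrations before the real query."""
    demos = "\n".join(f"Query: {q}\nRewritten query: {r}" for q, r in examples)
    return f"{INSTRUCTION}\n{demos}\nQuery: {query}\nRewritten query:"


def cot_prompt(query):
    """Ask the model to reason about intent before giving the rewrite."""
    return (
        f"{INSTRUCTION}\n"
        "Think step by step about what the user wants, "
        "then give the rewritten query.\n"
        f"Query: {query}\nReasoning:"
    )
```

Each function only builds a string; sending it to a model and parsing the answer is left out on purpose.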
Limitations
Concept drifts
Negative correlation between retrieval performance and expansion effects
When the downstream ranking model is weak, expansion helps; when it is strong, expansion actually hurts performance.
Retriever
Leveraging LLMs to generate search data
Search data refinement
Reranker
LLM as...
supervised reranker
unsupervised reranker
data augmentation
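One way to picture the unsupervised reranker role is pointwise scoring: prompt the LLM once per candidate and sort by the returned relevance score. The sketch below assumes a hypothetical `llm_score` callable that maps a prompt to a number (in practice this might be the probability of a "yes" answer); the prompt wording is also an assumption.

```python
def build_prompt(query, doc):
    """Relevance-judgment prompt for one (query, passage) pair."""
    return (
        f"Passage: {doc}\n"
        f"Query: {query}\n"
        "Does the passage answer the query? Answer yes or no."
    )


def pointwise_rerank(query, docs, llm_score):
    """Reorder docs by an LLM-derived relevance score, highest first.

    `llm_score` is a hypothetical callable: prompt string in, float out.
    No fine-tuning is involved, hence "unsupervised".
    """
    scored = [(llm_score(build_prompt(query, d)), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable for ties
    return [d for _, d in scored]
```

This costs one LLM call per candidate, which is one reason reranking is applied only to a small candidate set.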
Reader
Search agent
(note to self: I should figure out how these actually work)
Static agent
LaMDA
SeeKeR
WebAgent
WebGLM
Dynamic agent
WebGPT - special tokens for querying, scrolling through rankings, and quoting references on search engines.
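To get a feel for how a dynamic agent in the WebGPT style might operate, here is a toy action loop: a policy (standing in for the LLM) emits one action per step, and the environment executes it. The action names `SEARCH`, `SCROLL`, `QUOTE`, `ANSWER`, the state layout, and the `policy`/`search` callables are all hypothetical; they only mirror the querying, scrolling, and quoting behaviors mentioned above.

```python
def run_agent(question, policy, search, max_steps=8):
    """Run a policy-driven search loop until it answers or runs out of steps.

    policy(state) -> (action, argument); search(query) -> list of results.
    Both are hypothetical stand-ins for an LLM and a search engine.
    """
    state = {"question": question, "results": [], "offset": 0, "quotes": []}
    for _ in range(max_steps):
        action, arg = policy(state)
        if action == "SEARCH":
            state["results"] = search(arg)
            state["offset"] = 0
        elif action == "SCROLL":
            # Move down the ranking, clamped to the last result.
            last = max(len(state["results"]) - 1, 0)
            state["offset"] = min(state["offset"] + 1, last)
        elif action == "QUOTE":
            if state["results"]:
                state["quotes"].append(state["results"][state["offset"]])
        elif action == "ANSWER":
            return arg, state["quotes"]
    return None, state["quotes"]  # step budget exhausted without an answer
```

A static agent, by contrast, would follow a fixed sequence of these steps instead of letting the model choose the next action.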
Future Direction
Query rewriter