Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the site is run on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

A Survey on Transformer Context Extension: Approaches and Evaluation

Created by
  • Haebom

Author

Yijun Liu, Jinzheng Yu, Yang Xu, Zhongyang Li, Qingfu Zhu

Outline

This paper surveys long-context processing for Transformer-based large language models (LLMs). LLMs perform well on short-text tasks, but their performance degrades as the context grows longer. To address this, the survey systematically reviews recent work and proposes a taxonomy with four categories: positional encoding, context compression, retrieval augmentation, and attention patterns. It also organizes the relevant data, tasks, and metrics from existing long-context benchmarks, with a focus on long-context evaluation, summarizes open problems, and offers perspectives on future research directions.
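To make the positional-encoding category concrete, the sketch below illustrates positional interpolation, a common context-extension trick in that family (not a method proposed by this survey): RoPE positions are compressed by a scale factor so that a longer input maps back into the position range seen during pretraining. Function names and the 4k-to-16k setting are illustrative assumptions.

```python
import torch

def rope_frequencies(dim: int, base: float = 10000.0) -> torch.Tensor:
    """Standard RoPE inverse frequencies for a head dimension `dim`."""
    return 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))

def rope_angles(seq_len: int, dim: int, scale: float = 1.0) -> torch.Tensor:
    """Rotation angles per position.

    `scale` > 1 implements positional interpolation: positions are divided
    by `scale`, so a longer sequence stays inside the position range the
    model saw during pretraining.
    """
    inv_freq = rope_frequencies(dim)
    positions = torch.arange(seq_len, dtype=torch.float32) / scale
    return torch.outer(positions, inv_freq)  # shape: (seq_len, dim // 2)

# Hypothetical example: a model pretrained with a 4k context, run at 16k tokens.
pretrain_len, target_len, head_dim = 4096, 16384, 128
angles = rope_angles(target_len, head_dim, scale=target_len / pretrain_len)
print(angles.shape)          # torch.Size([16384, 64])
print(angles[-1, 0].item())  # last position is scaled back below 4096
```

Other taxonomy branches trade off differently: context compression and retrieval augmentation shorten or filter the input before attention, while sparse attention patterns reduce the cost of attending over the full long input.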

Takeaways, Limitations

Takeaways:
Provides a systematic review and classification of long-context tasks for LLMs
Introduces and categorizes the main approaches to long-context processing (positional encoding, context compression, retrieval augmentation, attention patterns); see the sketch above for an example from the positional-encoding family
Organizes relevant data, tasks, and metrics for long-context evaluation
Suggests future research directions
Limitations:
This paper focuses on the survey and classification of existing studies and does not present a new methodology.
The proposed classification scheme may not comprehensively cover all long-context processing approaches.
The precise definition and scope of long-context processing are not clearly discussed.