Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

R-Capsule: Compressing High-Level Plans for Efficient Large Language Model Reasoning

Created by
  • Haebom

Author

Hongyu Shan, Mingyang Song, Chang Dai, Di Liang, Han Chen

Outline

Chain-of-Thought (CoT) prompting helps large language models (LLMs) solve complex problems through step-by-step reasoning. However, explicit CoT is verbose, which increases latency and memory usage, and early errors can propagate through long chains. This paper proposes the Reasoning Capsule (R-Capsule) framework, which aims to combine the efficiency of latent reasoning with the transparency of explicit CoT. The core idea is to compress the high-level plan into a small set of learned latent tokens (Reasoning Capsules) while keeping the execution steps lightweight or explicit.

This hybrid approach is inspired by the Information Bottleneck (IB) principle: capsules should be minimal yet sufficient for the task. Minimality is encouraged by a low-capacity bottleneck, which improves efficiency. Sufficiency is encouraged by a dual objective: a primary task loss for answer accuracy, and an auxiliary plan-reconstruction loss that pushes the capsules to faithfully represent the original textual plan. The reconstruction objective grounds the latent space, improving interpretability and reducing reliance on uninformative shortcuts. Overall, the framework balances efficiency, accuracy, and interpretability, shrinking the visible token footprint of reasoning while maintaining or improving accuracy on complex benchmarks.
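The dual objective described above amounts to a combined loss of the form L = L_task + λ · L_recon, with the bottleneck enforced by keeping the number of capsule tokens small. Below is a minimal PyTorch-style sketch under that reading; the attention-pooling bottleneck, all shapes, and the weight lambda_recon are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ReasoningCapsuleBottleneck(nn.Module):
    """Illustrative low-capacity bottleneck that pools a full plan
    representation into a small, fixed set of latent capsule tokens.
    The attention-pooling design and all sizes are assumptions, not
    the paper's released code."""

    def __init__(self, hidden_dim: int = 768, num_capsules: int = 8):
        super().__init__()
        # One learned query per capsule; num_capsules << plan length,
        # which is what makes the bottleneck low-capacity.
        self.capsule_queries = nn.Parameter(torch.randn(num_capsules, hidden_dim))
        self.pool = nn.MultiheadAttention(hidden_dim, num_heads=8, batch_first=True)

    def forward(self, plan_hidden: torch.Tensor) -> torch.Tensor:
        # plan_hidden: (batch, plan_len, hidden_dim) encoder states of the text plan.
        batch = plan_hidden.size(0)
        queries = self.capsule_queries.unsqueeze(0).expand(batch, -1, -1)
        # Cross-attention compresses the whole plan into num_capsules tokens.
        capsules, _ = self.pool(queries, plan_hidden, plan_hidden)
        return capsules  # (batch, num_capsules, hidden_dim)


def r_capsule_loss(answer_logits, answer_ids, plan_recon_logits, plan_ids,
                   lambda_recon: float = 0.5):
    """Dual objective as described in the summary: a primary task loss for
    answer accuracy plus an auxiliary plan-reconstruction loss that keeps
    the capsules faithful to the original text plan. lambda_recon is an
    assumed weighting hyperparameter."""
    ce = nn.CrossEntropyLoss()
    task_loss = ce(answer_logits.flatten(0, 1), answer_ids.flatten())
    recon_loss = ce(plan_recon_logits.flatten(0, 1), plan_ids.flatten())
    return task_loss + lambda_recon * recon_loss
```

The design point is that num_capsules is far smaller than the plan length, so the capsules cannot simply copy the plan verbatim; the reconstruction loss then forces them to keep only the information that matters.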

Takeaways, Limitations

Takeaways:
  • Combines the efficiency of latent reasoning with the transparency of explicit CoT.
  • Achieves a balance between efficiency, accuracy, and interpretability via the Reasoning Capsule.
  • Reduces the visible token footprint of reasoning while maintaining or improving accuracy on complex benchmarks.
Limitations:
  • The paper does not explicitly state its limitations (though the learned latent space may pose interpretation difficulties).