Chain-of-Thought (CoT) prompting helps large language models (LLMs) solve complex problems through step-by-step reasoning. However, CoT suffers from verbose intermediate outputs, which increase latency and memory usage, and early errors can propagate through long reasoning chains. In this paper, we propose the Reasoning Capsule (R-Capsule) framework, which combines the efficiency of latent reasoning with the transparency of explicit CoT. The core idea is to compress a high-level plan into a small set of learned latent tokens (Reasoning Capsules) while keeping execution steps lightweight or explicit. Inspired by the Information Bottleneck (IB) principle, this hybrid approach encourages capsules to be minimal yet sufficient for the task. Minimality is enforced through a low-capacity bottleneck, which improves efficiency. Sufficiency is encouraged through a dual objective: a primary task loss for answer accuracy, and an auxiliary plan-reconstruction loss that encourages capsules to faithfully represent the original textual plan. The reconstruction objective grounds the latent space, improving interpretability and reducing reliance on uninformative shortcuts. The framework balances efficiency, accuracy, and interpretability, reducing the visible token footprint of inference while maintaining or improving accuracy on complex reasoning benchmarks.
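The dual objective described above can be sketched as a weighted sum of a task loss and a plan-reconstruction loss. The sketch below is illustrative only: the function names, the averaging over plan positions, and the weighting hyperparameter `lam` are our assumptions for exposition, not the paper's exact notation or implementation.

```python
import math

def cross_entropy(logits, target):
    # Softmax cross-entropy for one prediction over a vocabulary
    # (log-sum-exp computed stably by subtracting the max logit).
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def r_capsule_loss(answer_logits, answer_target,
                   plan_logits, plan_targets, lam=0.5):
    # Primary task loss: supervise the final answer.
    task_loss = cross_entropy(answer_logits, answer_target)
    # Auxiliary reconstruction loss: decode the original text plan
    # from the capsule tokens (averaged over plan positions); this
    # term grounds the latent space, as described in the abstract.
    recon_loss = sum(cross_entropy(l, t)
                     for l, t in zip(plan_logits, plan_targets)) / len(plan_targets)
    # lam (assumed hyperparameter) trades off sufficiency for the
    # task against faithfulness to the explicit plan.
    return task_loss + lam * recon_loss
```

Setting `lam = 0` recovers a purely latent objective with no grounding, while larger values push the capsule to stay reconstructable as explicit text.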