In the contextual Markov decision process (CMDP) setting, we propose an information-theoretic summarization approach that uses a large language model (LLM) to compress high-dimensional, unstructured context into a low-dimensional semantic summary. The summary augments the state, reducing redundancy while preserving decision-critical information. Building on a notion of approximate context sufficiency, we provide the first regret bound and delay-entropy tradeoff characterization for CMDPs. On various benchmarks, our method outperforms existing approaches, improving reward, success rate, and sample efficiency while reducing latency and memory usage.
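To make the summarize-then-augment pipeline concrete, the following is a minimal sketch under generic interfaces. The prompt wording, the `llm.complete` and `encoder.encode` calls, and all names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def summarize(context: str, llm, max_tokens: int = 64) -> str:
    """Compress unstructured context into a short semantic summary.

    `llm` is assumed to expose a text-completion interface; the prompt
    below is an illustrative placeholder, not the paper's prompt.
    """
    prompt = (
        "Summarize the decision-relevant facts in the context below "
        f"in at most {max_tokens} tokens, discarding redundant detail.\n\n"
        f"Context:\n{context}\n\nSummary:"
    )
    return llm.complete(prompt, max_tokens=max_tokens)

def augment_state(state: np.ndarray, summary: str, encoder) -> np.ndarray:
    """Concatenate the base MDP state with an embedding of the summary.

    `encoder` is any fixed text encoder mapping a string to a vector
    (e.g., a sentence-embedding model); this choice is an assumption.
    """
    z = encoder.encode(summary)        # low-dimensional semantic summary
    return np.concatenate([state, z])  # summary-augmented state fed to the policy
```

Under this reading, the policy conditions on the augmented state rather than the raw context, which is what trades summary entropy against the latency of the LLM call.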