This paper proposes "cache steering," a lightweight technique for implicitly steering language models via a one-shot intervention applied directly to the key-value cache. To demonstrate its effectiveness, we apply cache steering to induce chain-of-thought reasoning in small language models. By constructing steering vectors from reasoning traces obtained from a teacher model (e.g., GPT-4o) or from existing human annotations, we shift model behavior toward more explicit, multi-step reasoning without fine-tuning or prompt modification. Experiments on diverse reasoning benchmarks show that cache steering improves both the qualitative structure of model reasoning and quantitative task performance. Further experiments confirm that the method scales to larger models and yields additional gains on challenging datasets such as GPQA and MATH. Compared to prior activation steering techniques that require continuous intervention, one-shot cache steering offers substantial advantages in inference latency, hyperparameter stability, and ease of integration with existing inference APIs. Beyond inducing reasoning, cache steering enables the controllable transfer of reasoning styles, such as stepwise, causal, and analogical reasoning, making it a practical tool for behavior-level control of language models.
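The abstract describes two steps: building steering vectors as the mean key/value difference between prompts with and without a teacher reasoning trace, and adding those vectors once to the KV cache after prefilling the prompt. The sketch below illustrates one way this could look with Hugging Face transformers; the model name, the coefficient names `c_k` and `c_v`, the last-token extraction point, and the cache-indexing details are assumptions for illustration, not the paper's implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "HuggingFaceTB/SmolLM2-360M-Instruct"  # hypothetical choice of small model
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

@torch.no_grad()
def last_token_kv(text):
    """Return per-layer (key, value) states at the final prompt token,
    each of shape (num_kv_heads, head_dim)."""
    ids = tok(text, return_tensors="pt").input_ids
    cache = model(ids, use_cache=True).past_key_values
    # NOTE: recent transformers versions return a DynamicCache exposing
    # .key_cache / .value_cache lists; older versions return legacy tuples.
    return [(k[0, :, -1, :], v[0, :, -1, :])
            for k, v in zip(cache.key_cache, cache.value_cache)]

@torch.no_grad()
def build_steering_vectors(contrastive_pairs):
    """Mean per-layer KV difference between prompts that end with a teacher
    reasoning trace (positive) and the same prompts without one (negative)."""
    n = len(contrastive_pairs)
    sums = None
    for pos_text, neg_text in contrastive_pairs:
        pos, neg = last_token_kv(pos_text), last_token_kv(neg_text)
        diffs = [(pk - nk, pv - nv) for (pk, pv), (nk, nv) in zip(pos, neg)]
        sums = diffs if sums is None else [
            (sk + dk, sv + dv) for (sk, sv), (dk, dv) in zip(sums, diffs)]
    return [(sk / n, sv / n) for sk, sv in sums]

@torch.no_grad()
def generate_with_cache_steering(prompt, steer, c_k=0.1, c_v=4.0,
                                 max_new_tokens=256):
    """Prefill the prompt, add the steering vectors once to the cached
    keys/values of the last prompt token, then decode greedily.
    c_k and c_v are illustrative key/value scaling coefficients."""
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model(ids, use_cache=True)
    cache = out.past_key_values
    for layer, (dk, dv) in enumerate(steer):
        # One-shot intervention: every later token attends to the edited cache,
        # so no per-step hook is needed during decoding.
        cache.key_cache[layer][0, :, -1, :] += c_k * dk
        cache.value_cache[layer][0, :, -1, :] += c_v * dv
    # The first new token comes from the prefill logits (computed before
    # steering in this simplified sketch); later steps see the steered cache.
    next_id = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
    generated = [next_id.item()]
    for _ in range(max_new_tokens - 1):
        out = model(next_id, past_key_values=cache, use_cache=True)
        cache = out.past_key_values
        next_id = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
        if next_id.item() == tok.eos_token_id:
            break
        generated.append(next_id.item())
    return tok.decode(generated)
```

A contrastive pair here might be the same question with and without an appended trace, e.g. ("Q: ... A: Let's think step by step. First, ...", "Q: ... A:"). Because the intervention happens once at prefill rather than at every decoding step, generation afterward runs through the standard forward path, which is what gives the latency and API-integration advantages the abstract claims over continuous activation steering.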