Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; please cite the source when sharing.

KV Cache Steering for Controlling Frozen LLMs

Created by
  • Haebom

Author

Max Belitsky, Dawid J. Kopiczko, Michael Dorkenwald, M. Jehanzeb Mirza, James R. Glass, Cees G.M. Snoek, Yuki M. Asano

(Figure: Cache Steering overview)

Outline

This paper proposes "cache steering," a lightweight technique for implicitly steering language models via a one-shot intervention applied directly to the key-value cache. To demonstrate its effectiveness, the authors apply it to induce chain-of-thought reasoning in small language models. Steering vectors are constructed from reasoning traces obtained from a teacher model (e.g., GPT-4o) or from existing human annotations, shifting model behavior toward more explicit, multi-step reasoning without fine-tuning or prompt modification. Experiments on a range of reasoning benchmarks show that cache steering improves both the qualitative structure of the model's reasoning and quantitative task performance. Additional experiments confirm applicability to larger models and show further gains on challenging datasets such as GPQA and MATH. Compared to prior activation steering techniques, which require continuous intervention during generation, one-shot cache steering offers significant advantages in inference latency, hyperparameter stability, and ease of integration with existing inference APIs. Beyond guiding reasoning in general, cache steering enables the transfer of controllable reasoning styles, such as step-by-step, causal, and analogical reasoning, making it a practical tool for behavior-level control of language models.
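To make the mechanism concrete, here is a minimal sketch of the idea in Python with Hugging Face transformers. It is an illustration under stated assumptions, not the authors' implementation: the model name, the single contrastive pair, the `alpha` scale, and helper names such as `last_token_kv` are all hypothetical; the paper builds its steering vectors from many teacher traces rather than one pair, and position-encoding effects (e.g., RoPE) already baked into cached keys are glossed over here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder; any small causal LM will do
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def to_legacy(pkv):
    # Recent transformers versions return a Cache object; older ones return tuples.
    return pkv.to_legacy_cache() if hasattr(pkv, "to_legacy_cache") else pkv

def to_cache(legacy):
    # Re-wrap legacy (key, value) tuples for versions that expect Cache objects.
    try:
        from transformers import DynamicCache
        return DynamicCache.from_legacy_cache(legacy)
    except ImportError:
        return legacy

@torch.no_grad()
def last_token_kv(text):
    """Return each layer's (key, value) at the final token position of `text`."""
    ids = tok(text, return_tensors="pt").input_ids
    cache = to_legacy(model(ids, use_cache=True).past_key_values)
    return [(k[:, :, -1:, :], v[:, :, -1:, :]) for k, v in cache]

# One contrastive pair for brevity; the paper averages over many teacher traces.
pos = ("Q: A train travels 60 km in 1.5 hours. What is its speed?\n"
       "A: Let's think step by step. Speed = distance / time = 60 / 1.5 = 40 km/h.")
neg = ("Q: A train travels 60 km in 1.5 hours. What is its speed?\nA: 40 km/h.")
steer = [(kp - kn, vp - vn)
         for (kp, vp), (kn, vn) in zip(last_token_kv(pos), last_token_kv(neg))]

@torch.no_grad()
def generate(prompt, steering=None, alpha=4.0, max_new=80):
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model(ids, use_cache=True)
    cache = [(k, v) for k, v in to_legacy(out.past_key_values)]
    if steering is not None:
        # One-shot intervention: nudge the cached key/value of the prompt's
        # final token once, then leave the cache alone for the rest of decoding.
        cache = [(k.clone(), v.clone()) for k, v in cache]
        for (k, v), (dk, dv) in zip(cache, steering):
            k[:, :, -1:, :] += alpha * dk
            v[:, :, -1:, :] += alpha * dv
    cache = to_cache(tuple(cache))
    next_id = out.logits[:, -1].argmax(-1, keepdim=True)
    new_tokens = []
    for _ in range(max_new):
        new_tokens.append(next_id.item())
        if next_id.item() == tok.eos_token_id:
            break
        out = model(next_id, past_key_values=cache, use_cache=True)
        cache = out.past_key_values
        next_id = out.logits[:, -1].argmax(-1, keepdim=True)
    return tok.decode(new_tokens, skip_special_tokens=True)

question = "Q: A shop sells pens at 3 for $2. How much do 12 pens cost?\nA:"
print(generate(question))                  # baseline: typically a terse answer
print(generate(question, steering=steer))  # steered: biased toward step-by-step reasoning
```

The point the sketch captures is why the method is cheap: the intervention happens once, before decoding starts, so per-token generation cost is identical to the unsteered model, unlike activation steering, which must modify hidden states at every step.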

Takeaways, Limitations

Takeaways:
  • Presents a methodology for improving the reasoning ability of language models without fine-tuning or prompt modification.
  • Offers practical advantages over existing activation steering techniques: lower inference latency, more stable hyperparameters, and easier integration with existing inference APIs.
  • Suggests that model behavior can be modulated at the level of reasoning style (step-by-step, causal, analogical) through controllable style transfer.
  • Demonstrated to be applicable to both small and large language models.
Limitations:
  • The paper does not explicitly state specific limitations.