This paper proposes Krul, a system for efficient state restoration in multi-turn conversations with large language models (LLMs). Existing KV cache compression methods apply a single, fixed compression strategy to every conversation; to overcome this limitation, Krul dynamically selects a compression strategy by considering the similarity of attention patterns across conversations. Its key innovations are predictive compression strategy selection, token-wise heterogeneous attention similarity estimation, and a bubble-free restoration scheduler. Experimental results show that, compared with the best-performing existing methods, Krul reduces time-to-first-token (TTFT) by 1.5x-2.68x and KV cache storage by 1.33x-2.35x while maintaining generation quality.
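To make the general idea concrete, the sketch below shows one way a restoration plan could be derived from attention similarity: layers whose attention maps closely resemble a neighbor's are marked for recomputation instead of storage. This is a minimal illustration under assumed names and rules; the layer-pair comparison, the 0.9 threshold, and the store-vs-recompute decision are hypothetical and do not reproduce Krul's actual algorithm.

```python
import torch
import torch.nn.functional as F


def layer_attention_similarity(attn_a: torch.Tensor, attn_b: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between two layers' attention maps.

    attn_a, attn_b: [num_heads, seq_len, seq_len] attention weights.
    Returns a scalar averaged over heads.
    """
    a = attn_a.flatten(start_dim=1)  # [heads, seq_len * seq_len]
    b = attn_b.flatten(start_dim=1)
    per_head_sim = F.cosine_similarity(a, b, dim=-1)  # [heads]
    return per_head_sim.mean()


def choose_restoration_plan(attn_maps, sim_threshold: float = 0.9):
    """Assign each layer either 'store' (persist its KV cache) or 'recompute'
    (rebuild its KV at restoration time), based on how similar its attention
    pattern is to the previous layer's. Hypothetical illustration only.
    """
    plan = {0: "store"}  # first layer has no predecessor to compare against
    for layer in range(1, len(attn_maps)):
        sim = layer_attention_similarity(attn_maps[layer - 1], attn_maps[layer])
        plan[layer] = "recompute" if sim >= sim_threshold else "store"
    return plan


# Example: 4 layers, 8 heads, sequence length 128 with random attention maps.
attn = [torch.softmax(torch.randn(8, 128, 128), dim=-1) for _ in range(4)]
print(choose_restoration_plan(attn))
```

The point of such a plan is the trade-off the paper targets: storing KV costs space and loading bandwidth, while recomputation costs prefill time, so choosing per layer (and overlapping the two during restoration) can reduce both TTFT and storage.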