In this paper, we propose the LZ penalty, a novel penalty designed to reduce degenerate repetition in autoregressive language models. The penalty is based on the code lengths of the LZ77 compression algorithm; through the lens of the prediction-compression duality, decoding with the LZ penalty can be interpreted as sampling from the residual distribution obtained after removing the highly compressible information. Experimental results show that the LZ penalty prevents degenerate repetition without degrading performance, even under greedy decoding (temperature 0) with a state-of-the-art open-source reasoning model. In contrast, the existing frequency penalty and repetition penalty incur degenerate repetition rates of up to 4%.
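To make the mechanism concrete, the sketch below illustrates one way such a compression-based penalty could be applied at decoding time: each candidate token's logit is reduced in proportion to the length of the match it would extend against the recent context, a rough proxy for how cheaply an LZ77 coder could encode it. The names `lz_match_length`, `apply_lz_penalty`, `alpha`, and `window`, as well as the specific scoring, are assumptions made for illustration and do not reproduce the paper's exact formulation.

```python
# Conceptual sketch only: an LZ77-style repetition penalty applied to logits.
# The helper names, the match-length scoring, and the penalty scale `alpha`
# are illustrative assumptions, not the paper's formulation.
import numpy as np


def lz_match_length(context: list[int], candidate: int, window: int = 512) -> int:
    """Length of the longest suffix of context + [candidate] that already occurs
    earlier in the recent window (non-overlapping, for simplicity). A long match
    means the candidate token would be cheap for an LZ77 coder to encode."""
    seq = context[-window:] + [candidate]
    n = len(seq)
    for length in range(n // 2, 0, -1):      # try the longest possible match first
        suffix = seq[n - length:]
        earlier = seq[:n - length]
        for start in range(len(earlier) - length + 1):
            if earlier[start:start + length] == suffix:
                return length
    return 0


def apply_lz_penalty(logits: np.ndarray, context: list[int], alpha: float = 1.0) -> np.ndarray:
    """Subtract a penalty proportional to each candidate's compressibility,
    steering greedy decoding away from tokens that merely extend a repeated span."""
    penalized = logits.astype(float)
    for token_id in range(len(logits)):
        penalized[token_id] -= alpha * lz_match_length(context, token_id)
    return penalized


if __name__ == "__main__":
    vocab_size = 8
    context = [3, 1, 4, 1, 5, 3, 1, 4]   # toy token ids containing a repeated span
    logits = np.zeros(vocab_size)
    logits[1] = 2.5                       # model strongly favours continuing the repeat
    logits[6] = 1.0                       # a plausible non-repetitive alternative
    print("greedy without penalty:", int(np.argmax(logits)))                              # -> 1
    print("greedy with LZ penalty:", int(np.argmax(apply_lz_penalty(logits, context))))   # -> 6
```

In this toy run, the unmodified greedy step would repeat token 1, since it extends a four-token match with the earlier context; the penalized step instead selects a token that an LZ77-style coder could not compress against the window.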