This paper studies how to keep the knowledge of language models (LMs) accurate and up-to-date through post-training knowledge editing (KE). In particular, we address the problem of ensuring that LMs correctly answer queries about logically related knowledge after editing, that is, that they properly handle ripple effects. We analyze why existing KE methods still produce messy ripple effects and propose a metric, GradSim, defined as the cosine similarity between the gradients of the original fact and its related knowledge. We observe a strong positive correlation between ripple-effect performance and GradSim across a variety of LMs, KE methods, and evaluation metrics, and show that three counterintuitive failure cases, namely negation, excessive ripple effects, and multilingualism, are associated with low GradSim. In conclusion, we verify that GradSim is an effective indicator of when knowledge ripples in LMs.
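As a minimal sketch of the definition stated above, GradSim can be written as the cosine similarity between the parameter gradients induced by the edited fact and by a related fact; the specific symbols below (loss $\mathcal{L}$, parameters $\theta$, edited fact $f$, related fact $f'$) are illustrative notation rather than the paper's exact formulation:
\[
\mathrm{GradSim}(f, f') \;=\; \cos\!\big(\nabla_{\theta}\mathcal{L}(f),\, \nabla_{\theta}\mathcal{L}(f')\big)
\;=\; \frac{\nabla_{\theta}\mathcal{L}(f) \cdot \nabla_{\theta}\mathcal{L}(f')}{\big\lVert \nabla_{\theta}\mathcal{L}(f) \big\rVert \, \big\lVert \nabla_{\theta}\mathcal{L}(f') \big\rVert}.
\]
High GradSim indicates that updating the model on $f$ moves parameters in a direction that also benefits $f'$, which is consistent with the observed correlation between GradSim and ripple-effect performance.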