
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Why Does New Knowledge Create Messy Ripple Effects in LLMs?

Created by
  • Haebom

Authors

Jiaxin Qin, Zixuan Zhang, Manling Li, Pengfei Yu, Heng Ji

Outline

This paper studies post-training knowledge editing (KE) as a way to keep the knowledge of language models (LMs) accurate and up-to-date. In particular, it addresses the problem of ensuring that, after an edit, LMs also answer logically related knowledge correctly, i.e., that they handle ripple effects properly. The authors analyze why existing KE methods still produce messy ripple effects and propose GradSim, a metric defined as the cosine similarity between the gradients of the original fact and its related knowledge. They observe a strong positive correlation between ripple-effect performance and GradSim across a variety of LMs, KE methods, and evaluation metrics, and show that three counterintuitive failure cases (negation, over-ripple, and multilingual settings) are associated with low GradSim. In conclusion, they verify that GradSim is an effective indicator of when knowledge ripples through LMs.
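To make the metric concrete, below is a minimal sketch of how a GradSim-style score could be computed: take the gradient of the language-modeling loss for the original fact and for a related fact, flatten each into a single vector, and measure their cosine similarity. This assumes a Hugging Face causal LM; the gpt2 checkpoint, the prompt pairs, and the helper names (fact_gradient, gradsim) are illustrative choices for this sketch, not the authors' implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; the paper evaluates a variety of LMs.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()  # disable dropout; gradients are still computed below

def fact_gradient(prompt: str, target: str) -> torch.Tensor:
    """Flattened gradient of the LM loss on `prompt + target` w.r.t. all parameters."""
    inputs = tokenizer(prompt + " " + target, return_tensors="pt")
    # Standard causal-LM loss: labels are the input ids, shifted internally.
    outputs = model(**inputs, labels=inputs["input_ids"])
    model.zero_grad()
    outputs.loss.backward()
    grads = [p.grad.detach().flatten() for p in model.parameters() if p.grad is not None]
    return torch.cat(grads)

def gradsim(fact_a: tuple, fact_b: tuple) -> float:
    """Cosine similarity between the parameter gradients of two facts."""
    g_a = fact_gradient(*fact_a)
    g_b = fact_gradient(*fact_b)
    return torch.nn.functional.cosine_similarity(g_a, g_b, dim=0).item()

# Original edited fact vs. a logically related ripple fact (hypothetical example).
score = gradsim(
    ("The capital of France is", "Paris"),
    ("Paris is the capital of", "France"),
)
print(f"GradSim = {score:.4f}")
```

Under the paper's finding, a high score for such a pair would predict that editing the first fact also updates the model's answer to the second, while the low-GradSim failure cases (negation, over-ripple, multilingual) would show near-zero or negative similarity.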

Takeaways, Limitations

Takeaways: GradSim is a useful metric for predicting and analyzing ripple effects in LMs after knowledge editing. It can help improve the performance of KE methods and address ripple-effect failures.
Limitations: GradSim establishes a correlation but does not fully prove causality. Additional analysis of failure cases beyond the three presented is needed, as is further research on GradSim's computational cost and generalization performance.