
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

DP2Unlearning: An Efficient and Guaranteed Unlearning Framework for LLMs

Created by
  • Haebom

Author

Tamim Al Mahmud, Najeeb Jebreel, Josep Domingo-Ferrer, David Sanchez

Outline

This paper presents DP2Unlearning, a novel unlearning framework that addresses the problem of large language models (LLMs) memorizing and leaking personal or copyrighted information contained in their training data. Retraining from scratch is prohibitively expensive, while approximate unlearning methods lack rigorous forgetting guarantees. DP2Unlearning instead applies ε-differential privacy (DP) to the training data and trains the LLM on the protected data, which later enables efficient unlearning with a formal guarantee against disclosure of the forgotten information, controlled by the chosen ε. Experiments show that DP2Unlearning achieves performance similar to retraining at roughly half the cost, and outperforms approximate unlearning methods in both preserving model utility and forgetting the targeted information.
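A common way to obtain a model trained under an ε-DP guarantee is DP-SGD: clip each example's gradient to bound its influence, then add calibrated Gaussian noise to the averaged update. The sketch below illustrates that mechanism only; it is a minimal, hypothetical illustration (function names and hyperparameters are assumptions, not the paper's actual procedure, and the ε accounting itself is omitted):

```python
import numpy as np

def dp_sgd_step(params, per_sample_grads, clip_norm=1.0,
                noise_multiplier=1.1, lr=0.1, rng=None):
    """One DP-SGD-style update (illustrative sketch, not the paper's method).

    Clipping bounds each example's contribution to at most `clip_norm`;
    Gaussian noise scaled by `noise_multiplier * clip_norm` is what yields
    an (epsilon, delta)-DP guarantee, with epsilon computed separately by
    a privacy accountant (not implemented here).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Clip each per-example gradient to L2 norm <= clip_norm.
    clipped = [
        g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
        for g in per_sample_grads
    ]
    mean_grad = np.mean(clipped, axis=0)
    # Add Gaussian noise scaled to the clipping bound and batch size.
    noise = rng.normal(
        0.0,
        noise_multiplier * clip_norm / len(per_sample_grads),
        size=mean_grad.shape,
    )
    return params - lr * (mean_grad + noise)
```

Because clipping caps every example's influence at `clip_norm`, a single update can move the parameters by at most `lr * clip_norm` (plus noise) regardless of how extreme any one example's gradient is, which is the property the later unlearning guarantee builds on.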

Takeaways, Limitations

Takeaways:
Presents an efficient, formally guaranteed solution to privacy and copyright information leakage in large language models.
Achieves unlearning performance similar to retraining at a much lower cost (roughly half).
Outperforms approximate unlearning methods in both model utility preservation and target information forgetting.
Provides formal forgetting guarantees grounded in differential privacy (DP).
Limitations:
Applying DP may degrade model performance, a general consequence of DP training, though the paper does not discuss it.
The choice of ε entails a trade-off between model performance and the strength of the privacy guarantee.
The computational cost and complexity of deploying DP2Unlearning in practice require further analysis.