Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Unified Neural Backdoor Removal with Only Few Clean Samples through Unlearning and Relearning

Created by
  • Haebom

Author

Nay Myat Min, Long H. Pham, Jun Sun

Outline

In this paper, we propose UnLearn and ReLearn (ULRL), a novel method that defends against backdoor attacks using only a limited amount of clean data. ULRL takes a two-step approach to identify and retrain neurons that are overly sensitive to backdoor triggers. In the first step, Unlearning, the network's loss on a small set of clean data is intentionally maximized to expose neurons that are sensitive to backdoor triggers. In the second step, Relearning, these suspicious neurons are retrained with targeted re-initialization and cosine-similarity regularization, neutralizing the backdoor influence while preserving the model's performance on clean data. Through extensive experiments against 12 types of backdoor attacks, on datasets including CIFAR-10, CIFAR-100, GTSRB, and Tiny-ImageNet and architectures including PreAct-ResNet18, VGG19-BN, and ViT-B-16, we show that ULRL significantly reduces the attack success rate without compromising clean accuracy. Notably, it remains effective even when only 1% of the clean data is available for defense.
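The two steps described above can be sketched, very roughly, on a toy linear classifier. Everything below (function names, hyperparameters, the hand-built "backdoored" weight matrix) is illustrative only, not from the paper's code; in the real method the suspicious neurons are hidden units of a deep network selected from the unlearning drift scores.

```python
import numpy as np

# Toy sketch of the ULRL idea (illustrative, not the paper's implementation).
# "Neurons" here are the weight rows of a single linear layer.

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def one_hot(y, k):
    return np.eye(k)[y]

def unlearn_scores(W, X, y, lr=0.5, steps=5):
    """Step 1 (Unlearning): gradient-ASCEND the loss on a small clean set
    and score each neuron (weight row) by how far it drifts; in ULRL,
    large drift marks a neuron as backdoor-suspicious."""
    Wa, k = W.copy(), W.shape[1]
    for _ in range(steps):
        g = X.T @ (softmax(X @ Wa) - one_hot(y, k)) / len(X)
        Wa += lr * g  # ascend: intentionally maximize clean-data loss
    return np.linalg.norm(Wa - W, axis=1)

def relearn(W, suspicious, X, y, lr=0.5, lam=0.05, steps=30, seed=0):
    """Step 2 (Relearning): re-initialize suspicious neurons, then fine-tune
    on clean data with a cosine-similarity penalty that discourages the new
    weights from re-aligning with the original (possibly backdoored) ones."""
    rng = np.random.default_rng(seed)
    k = W.shape[1]
    W_old, Wn = W.copy(), W.copy()
    Wn[suspicious] = 0.1 * rng.standard_normal((int(suspicious.sum()), k))
    for _ in range(steps):
        g = X.T @ (softmax(X @ Wn) - one_hot(y, k)) / len(X)
        Wn -= lr * g  # descend: restore accuracy on clean data
        for j in np.where(suspicious)[0]:
            # push the retrained row away from its old (backdoored) direction
            w, w0 = Wn[j], W_old[j]
            nw = np.linalg.norm(w) + 1e-8
            n0 = np.linalg.norm(w0) + 1e-8
            cos = w @ w0 / (nw * n0)
            Wn[j] = w - lr * lam * (w0 / (nw * n0) - cos * w / nw**2)
    return Wn
```

In this toy, a hand-built weight matrix where one input row acts as a trigger can be "cleaned": re-initializing that row and relearning on trigger-free clean data removes the backdoor while the clean predictions stay intact.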

Takeaways, Limitations

Takeaways:
  • Demonstrates that effective defense against backdoor attacks is possible even with very limited clean data.
  • Shows high performance across diverse datasets and architectures, suggesting strong generalization.
  • Offers a new defense strategy against backdoor attacks, contributing to stronger machine learning security.
Limitations:
  • The computational cost and complexity of the proposed method are not analyzed.
  • Robustness against other types of backdoor attacks requires further study.
  • Applicability and effectiveness in real-world deployments need further validation.