Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Forget the Data and Fine-Tuning! Just Fold the Network to Compress

Created by
  • Haebom

Authors

Dong Wang, Haris Šikić, Lothar Thiele, Olga Saukh

Outline

This paper proposes model folding, a novel data-free model compression technique that merges structurally similar neurons across layers, significantly reducing model size without fine-tuning or access to training data. Unlike existing methods, model folding leverages k-means clustering to group similar neurons and introduces data-free techniques that prevent variance collapse or explosion, thereby preserving data statistics during compression. Through a theoretical framework and experiments on standard benchmarks, including ResNet18 and LLaMA-7B, the authors demonstrate that model folding achieves performance comparable to data-driven compression techniques and outperforms recently proposed data-free methods, particularly at high sparsity levels. The approach is especially effective for compressing large-scale models, making it suitable for deployment in resource-constrained environments.
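To make the folding idea concrete, here is a minimal, illustrative sketch for a single fully connected layer, not the authors' implementation: hidden neurons are clustered by their incoming weights with k-means, each cluster is collapsed to its centroid, and the outgoing weights of merged neurons are summed. The function and variable names (fold_layer, W1, b1, W2) are hypothetical.

```python
# Illustrative sketch of data-free neuron folding via k-means (assumption: a
# plain two-layer MLP with W1 (hidden x in), bias b1, and W2 (out x hidden)).
import numpy as np
from sklearn.cluster import KMeans

def fold_layer(W1, b1, W2, n_clusters):
    """Merge structurally similar hidden neurons without any data."""
    # Each row describes one hidden neuron: its incoming weights plus bias.
    features = np.concatenate([W1, b1[:, None]], axis=1)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)

    # Folded incoming weights and biases: one centroid per cluster.
    centroids = km.cluster_centers_
    W1_f, b1_f = centroids[:, :-1], centroids[:, -1]

    # Folded outgoing weights: sum the columns of the merged neurons, since the
    # shared centroid neuron now feeds the next layer once on their behalf.
    W2_f = np.zeros((W2.shape[0], n_clusters))
    for j, c in enumerate(km.labels_):
        W2_f[:, c] += W2[:, j]
    return W1_f, b1_f, W2_f

# Example: compress a 64-neuron hidden layer down to 16 folded neurons.
rng = np.random.default_rng(0)
W1, b1, W2 = rng.normal(size=(64, 32)), rng.normal(size=64), rng.normal(size=(10, 64))
W1_f, b1_f, W2_f = fold_layer(W1, b1, W2, n_clusters=16)
print(W1_f.shape, W2_f.shape)  # (16, 32) (10, 16)
```

Summing (rather than averaging) the outgoing columns keeps the next layer's pre-activations approximately unchanged, because each merged neuron's activation stands in for every member of its cluster.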

Takeaways, Limitations

Takeaways:
Presents a novel method for compressing models without any training data or fine-tuning.
Outperforms existing data-free methods, especially at high sparsity levels.
Effective for compressing large-scale models, making it suitable for resource-constrained environments.
Preserves data statistics during compression by combining k-means clustering of similar neurons with a data-free variance correction (see the sketch after this list).
Limitations:
Further research is needed to establish the generalization performance of the proposed method.
Further experiments are needed on a wider range of model architectures and datasets.
A sensitivity analysis of the k-means clustering parameters (e.g., the number of clusters) is needed.
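Regarding the variance-preservation takeaway above, the snippet below is a hedged, assumption-laden sketch of one possible data-free variance correction, not the paper's exact procedure: assuming roughly independent, unit-variance layer inputs, each downstream unit's outgoing weights are rescaled so its approximate pre-activation variance matches the pre-folding value. The function name datafree_variance_rescale and the variable names are hypothetical.

```python
# Hedged sketch of a data-free variance repair after folding. Assumption: the
# folded layer's inputs are roughly independent with unit variance, so the
# pre-activation variance of output unit k is approximated by the sum of its
# squared weights.
import numpy as np

def datafree_variance_rescale(W_out_orig, W_out_folded, eps=1e-8):
    """Rescale folded outgoing weights so each downstream unit keeps roughly
    the pre-activation variance it had before folding (no data required)."""
    var_orig = (W_out_orig ** 2).sum(axis=1)       # approx. variance before folding
    var_fold = (W_out_folded ** 2).sum(axis=1)     # approx. variance after folding
    scale = np.sqrt(var_orig / (var_fold + eps))   # per-unit correction factor
    return W_out_folded * scale[:, None]

# Example: correct outgoing weights of a layer folded from 64 to 16 neurons.
rng = np.random.default_rng(1)
W_out_orig = rng.normal(size=(10, 64))             # before folding (10 x 64)
W_out_folded = rng.normal(size=(10, 16))           # after folding  (10 x 16)
print(datafree_variance_rescale(W_out_orig, W_out_folded).shape)  # (10, 16)
```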