This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
Two-Stage Regularization-Based Structured Pruning for LLMs
Created by
Haebom
Author
Mingkuan Feng, Jinyang Wu, Siyuan Liu, Shuai Zhang, Ruihan Jin, Feihu Che, Pengpeng Shao, Zhengqi Wen, Jianhua Tao
Outline
In this paper, we propose Two-Stage Regularization-Based Structured Pruning (TRSP), a novel structured pruning method for efficient deployment of large language models (LLMs). Unlike existing methods that directly remove unimportant parameters, TRSP shrinks the model through a two-stage regularization process that preserves important information. In the first stage, the output of each Transformer layer is multiplied by a learnable weight, and an $\ell_1$-norm penalty on these weights is added to the training loss so that the weights are learned. In the second stage, an additional regularization term penalizes the difference between the input and output of layers with small weights, driving those layers toward identity mappings so that their information migrates to the preserved layers. Experimental results show that TRSP outperforms strong layer-wise structured pruning baselines without retraining and delivers significant inference speedups.
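To make the two stages concrete, below is a minimal PyTorch sketch of the idea, not the paper's implementation. The names `GatedLayer` and `trsp_regularizer`, and the hyperparameters `l1_strength`, `diff_strength`, and `gate_threshold`, are hypothetical choices made for illustration; the paper's actual loss weighting and schedule may differ.

```python
import torch
import torch.nn as nn

class GatedLayer(nn.Module):
    """Wraps a block so its output is scaled by a learnable layer weight (gate)."""
    def __init__(self, block: nn.Module):
        super().__init__()
        self.block = block
        self.gate = nn.Parameter(torch.ones(1))  # the learnable per-layer weight

    def forward(self, x):
        return self.gate * self.block(x)

def trsp_regularizer(layers, x, l1_strength=1e-3, diff_strength=1e-2,
                     gate_threshold=0.1, stage=1):
    """One forward pass; returns (output, regularization loss) for the given stage.
    All strengths/thresholds are illustrative, not the paper's values."""
    reg = x.new_zeros(())
    h = x
    for layer in layers:
        h_in = h
        h = layer(h_in)
        if stage == 2 and layer.gate.abs().item() < gate_threshold:
            # Stage 2: penalize the input/output gap of low-weight layers,
            # pushing them toward identity so their information moves
            # into the layers that will be kept.
            reg = reg + diff_strength * (h - h_in).pow(2).mean()
    if stage == 1:
        # Stage 1: l1 penalty on the gates makes layer importance sparse.
        reg = reg + l1_strength * sum(l.gate.abs().sum() for l in layers)
    return h, reg

# Toy usage: four gated MLP blocks standing in for Transformer layers.
layers = nn.ModuleList(
    [GatedLayer(nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 16)))
     for _ in range(4)]
)
x = torch.randn(8, 16)
out, reg = trsp_regularizer(layers, x, stage=1)
loss = out.pow(2).mean() + reg  # placeholder task loss plus regularizer
loss.backward()
```

In an actual setup, the gated blocks would be the model's Transformer layers and the regularizer would be added to the language-modeling loss; after stage 2, layers whose gates fall below the threshold can be removed outright, since their input/output difference has been driven toward zero.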
Takeaways, Limitations
•
Takeaways:
◦
A novel structured pruning method for efficient deployment of LLMs is presented.
◦
Outperforms existing methods without requiring retraining.
◦
Achieves significant speedups through layer-wise pruning.
◦
Preserves information effectively via the $\ell_1$-norm-based two-stage regularization.
•
Limitations:
◦
The performance of the proposed method may depend on specific LLM architectures or datasets.
◦
Further comparative analysis with other pruning methods or optimization techniques may be needed.
◦
Further experiments are needed to assess performance and efficiency on very large LLMs.
◦
Further research is needed on hyperparameter tuning for the two-stage regularization process.