LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization
Created by
Haebom
Author
Xujia Wang, Yunjia Qi, Bin Xu
Outline
Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA greatly reduce the number of trainable parameters by introducing low-rank decomposition matrices. However, existing methods still perform a large number of matrix multiplications and show low computational efficiency and weak fine-tuning performance on domain-specific tasks. This paper proposes Low-Resources Subnet Integration Adaptation (LoSiA), a method that dynamically identifies and optimizes critical parameters during training. Specifically, it localizes a subnetwork via gradient sparsity analysis and uses it as the trainable target. Updating only the subnetwork parameters enables effective high-rank adaptation while avoiding additional matrix multiplications. The paper also presents LoSiA-Pro, a faster implementation of LoSiA that reduces training latency by about 27% compared to LoRA. Extensive evaluations show that the method requires the least training time on domain-specific and common-sense reasoning tasks while minimizing performance degradation relative to full fine-tuning. Further analysis confirms that LoSiA also reduces forgetting during continual learning. The source code is available at https://github.com/KlozeWang/LoSiA.
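To make the mechanism concrete, below is a minimal PyTorch-style sketch of the general idea described above: periodically localizing a small subnetwork from gradient statistics and then updating only those parameters. This is not the authors' implementation; the keep ratio, re-localization interval, plain-SGD update, and all function names are assumptions for illustration, and the paper's actual localization procedure and LoSiA-Pro optimizations are more involved.
```python
import torch


def localize_subnet(model, keep_ratio=0.05):
    """Select the most salient entries of each parameter tensor by gradient
    magnitude -- a simplified stand-in for gradient-sparsity-based localization."""
    masks = {}
    for name, param in model.named_parameters():
        if param.grad is None:
            continue
        scores = param.grad.detach().abs()
        k = max(1, int(keep_ratio * scores.numel()))
        threshold = torch.topk(scores.flatten(), k).values.min()
        masks[name] = (scores >= threshold).to(param.dtype)
    return masks


def masked_sgd_step(model, masks, lr=1e-4):
    """Plain SGD step applied only to the localized subnetwork, so no extra
    low-rank matrix multiplications are introduced."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.grad is None or name not in masks:
                continue
            param -= lr * param.grad * masks[name]


def train(model, data_loader, loss_fn, relocate_every=100):
    """Hypothetical loop: re-localize the subnetwork every `relocate_every` steps."""
    masks = None
    for step, (inputs, targets) in enumerate(data_loader):
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        if masks is None or step % relocate_every == 0:
            masks = localize_subnet(model)  # dynamic re-identification of the subnet
        masked_sgd_step(model, masks)
```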
• Takeaways:
◦ LoSiA is proposed to address the computational inefficiency of existing PEFT methods.
◦ Gradient sparsity analysis enables efficient identification and optimization of target subnetworks.
◦ Training time is reduced while performance degradation relative to full fine-tuning is minimized.
◦ Forgetting during continual learning is reduced.
◦ The LoSiA-Pro implementation reduces training latency by roughly 27% compared to LoRA.
• Limitations:
◦ Whether the reported performance gains generalize to all model types and tasks requires further study.
◦ The limitations of gradient sparsity analysis as a subnetwork selection criterion are not discussed in depth.
◦ The sensitivity of LoSiA to different hyperparameter settings is not analyzed.