Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters

Created by
  • Haebom

Author

Klaudia Bałazy, Mohammadreza Banaei, Karl Aberer, Jacek Tabor

Outline

This paper presents LoRA-XS, a novel parameter-efficient fine-tuning method that addresses a key limitation of LoRA: storing and deploying separate modules for many tasks or users becomes costly in both storage and computation. LoRA-XS drastically reduces the number of trainable parameters by inserting a small trainable weight matrix between frozen low-rank matrices obtained from the singular value decomposition (SVD) of the pre-trained weights. On a 7B model, it reduces storage requirements by more than 100x compared to LoRA, and its trainable parameter count can be scaled from a single parameter per module up to arbitrary sizes. Evaluations on GLUE, GSM8K, MATH, and commonsense reasoning benchmarks show that LoRA-XS matches or exceeds the accuracy of LoRA and VeRA while offering superior parameter efficiency. Additional experiments highlighting the importance of the singular vectors of transformer weights underline LoRA-XS as a robust, storage-efficient solution for scaling and personalizing large language models.
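As a rough illustration of the idea described above, the sketch below implements a LoRA-XS-style linear layer in PyTorch: the frozen low-rank factors come from a truncated SVD of the pre-trained weight, and only a small r x r matrix R is trained. The class name, the default rank, and the exact placement of the singular values are assumptions made for this sketch, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class LoRAXSLinear(nn.Module):
    """Sketch of a LoRA-XS-style linear layer (illustrative, not the official code).

    The frozen projections come from a truncated SVD of the pre-trained
    weight W; only the small rank x rank matrix R is trained.
    """

    def __init__(self, weight: torch.Tensor, rank: int = 8):
        super().__init__()
        # Frozen pre-trained weight of shape (out_features, in_features).
        self.weight = nn.Parameter(weight.clone(), requires_grad=False)
        # Truncated SVD: W ~= U_r diag(S_r) V_r^T.
        U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
        # Frozen low-rank factors (folding S into the left factor is an assumption).
        self.register_buffer("A", U[:, :rank] * S[:rank])  # (out_features, rank)
        self.register_buffer("B", Vh[:rank, :])             # (rank, in_features)
        # The only trainable parameters: an r x r matrix inserted between A and B.
        self.R = nn.Parameter(torch.zeros(rank, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = x W^T + x (A R B)^T : base output plus the low-rank update.
        return x @ self.weight.T + x @ self.B.T @ self.R.T @ self.A.T


# Example: wrap a 4096x4096 projection (7B-class hidden size) with rank 8.
layer = LoRAXSLinear(torch.randn(4096, 4096), rank=8)
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 64 trainable params
```

Initializing R to zeros means the adapted layer starts out identical to the pre-trained one, so fine-tuning only gradually deviates from the base model.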

Takeaways, Limitations

Takeaways:
LoRA-XS is a novel fine-tuning method that effectively addresses LoRA's storage and computational cost issues.
On a 7B model, it requires over 100x less storage than LoRA (see the back-of-the-envelope sketch after this section).
The number of trainable parameters is flexible, ranging from a single parameter per module to arbitrary sizes.
Achieves accuracy equal to or better than LoRA and VeRA on GLUE, GSM8K, MATH, and commonsense reasoning benchmarks.
Experimentally demonstrates the importance of the singular vectors of transformer weights.
Provides a storage-efficient solution for scaling and personalizing large language models.
Limitations:
Further research is needed to establish how well the reported results generalize.
More extensive experiments across different model sizes and tasks are needed.
It remains to be verified whether the performance gains of LoRA-XS are biased toward specific datasets or tasks.
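To make the storage takeaway above concrete, the back-of-the-envelope comparison below contrasts per-module trainable parameter counts: LoRA trains two d x r factors, while LoRA-XS trains only the r x r matrix. The hidden size and rank are illustrative assumptions, not numbers taken from the paper.

```python
# Illustrative per-module trainable parameter counts (d and r are assumed values).
d, r = 4096, 8                    # hidden size of a 7B-class model, adapter rank
lora_params = 2 * d * r           # LoRA: two d x r factors are trained
lora_xs_params = r * r            # LoRA-XS: only the r x r matrix R is trained
print(lora_params, lora_xs_params, lora_params / lora_xs_params)
# 65536 64 1024.0 -> a >100x reduction per module is plausible at this scale
```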