Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

Orthogonal Finetuning Made Scalable

Created by
  • Haebom

Author

Zeju Qiu, Weiyang Liu, Adrian Weller, Bernhard Schölkopf

Outline

This paper proposes OFTv2, a scalable reformulation of Orthogonal Finetuning (OFT), whose high runtime and memory requirements have limited its practical use. OFTv2 reduces computational cost by restructuring OFT's core bottleneck, the weight-centric implementation, into an input-centric one, and introduces a Cayley-Neumann parameterization for efficient orthogonal parameterization. Together these changes yield up to 10x faster training and 3x lower GPU memory usage. OFTv2 also supports fine-tuning of quantized base models, where it outperforms QLoRA.
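
A minimal sketch of the two ideas, assuming a PyTorch-style implementation (the function names, shapes, and number of Neumann terms below are illustrative assumptions, not the authors' reference code): the Cayley transform R = (I + Q)(I - Q)^{-1} of a skew-symmetric Q is approximated with a truncated Neumann series, and the resulting rotation is applied to the layer's output activations instead of being merged into the frozen weight matrix.

```python
# Illustrative sketch of an input-centric OFT forward pass with a
# Cayley-Neumann parameterization. Not the paper's reference code;
# real OFT typically uses block-diagonal rotations for efficiency.
import torch

def cayley_neumann(Q_raw: torch.Tensor, num_terms: int = 3) -> torch.Tensor:
    """Approximate the Cayley transform R = (I + Q)(I - Q)^{-1} of a
    skew-symmetric Q, replacing the exact inverse with a truncated
    Neumann series (I - Q)^{-1} ~ I + Q + Q^2 + ..."""
    Q = Q_raw - Q_raw.T                      # enforce skew-symmetry
    eye = torch.eye(Q.shape[0], dtype=Q.dtype, device=Q.device)
    inv_approx = eye.clone()
    term = eye.clone()
    for _ in range(num_terms):
        term = term @ Q                      # Q, Q^2, Q^3, ...
        inv_approx = inv_approx + term
    return (eye + Q) @ inv_approx            # approximately orthogonal R

def oft_forward(x: torch.Tensor, W: torch.Tensor, Q_raw: torch.Tensor) -> torch.Tensor:
    """Input-centric OFT: run the ordinary forward pass with the frozen
    weight first, then rotate the output activations, so the dense
    matrix-matrix product R @ W is never materialized."""
    R = cayley_neumann(Q_raw)
    return (x @ W.T) @ R.T                   # equals x @ (R @ W).T

# Hypothetical usage: only Q_raw is trainable, W stays frozen.
d_out, d_in = 8, 16
W = torch.randn(d_out, d_in)                             # frozen pretrained weight
Q_raw = torch.zeros(d_out, d_out, requires_grad=True)    # trainable OFT parameter
x = torch.randn(4, d_in)
y = oft_forward(x, W, Q_raw)
```

Because the frozen (and possibly quantized) weight W is only ever used in the ordinary forward pass, the rotation reduces to cheap matrix-vector work on activations, which is where the reported speed and memory savings come from.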

Takeaways, Limitations

Takeaways:
OFTv2 resolves the computational-cost bottleneck of the original OFT, improving training speed and memory efficiency.
The Cayley-Neumann parameterization provides an efficient way to maintain approximate orthogonality during training.
It supports fine-tuning of quantized base models and outperforms QLoRA in that setting.
Limitations:
The paper may lack detailed information on specific performance comparisons and experimental setups.
Further research may be needed on the generalization performance of OFTv2 and its applicability to various model architectures.
Additional tuning or optimization may be required in practical deployments.