Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please credit the source when sharing.

BASE-Q: Bias and Asymmetric Scaling Enhanced Rotational Quantization for Large Language Models

Created by
  • Haebom

Authors

Liulu He, Shenli Zheng, Karwei Sun, Yijiang Liu, Yufei Zhao, Chongkang Tan, Huanrui Yang, Yuan Du, Li Du

Outline

This paper proposes BASE-Q, a method for improving the effectiveness of rotation in the quantization pipeline of large language models (LLMs). Existing rotation-based quantization methods suffer from channel-mean misalignment and from increased rounding and clipping errors caused by Gaussian-shaped activation distributions. BASE-Q reduces these errors by combining bias correction with asymmetric scaling, and it avoids memory-intensive full-model backpropagation by using block-wise optimization. Experimental results on various LLMs and benchmarks show that BASE-Q reduces accuracy loss by 50.5%, 42.9%, and 29.2% compared to the existing methods QuaRot, SpinQuant, and OSTQuant, respectively.
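To make the two ingredients mentioned above more concrete, here is a minimal sketch that combines a random orthogonal rotation with per-channel bias (mean) correction and asymmetric scaling before uniform quantization. Everything in it (the function names, the random-QR rotation, the 4-bit setting) is an illustrative assumption, not the authors' released implementation of BASE-Q.

```python
import numpy as np

def quantize_asymmetric(x, n_bits=4):
    """Per-channel asymmetric uniform quantization (columns = channels)."""
    qmin, qmax = 0, 2 ** n_bits - 1
    x_min = x.min(axis=0, keepdims=True)
    x_max = x.max(axis=0, keepdims=True)
    scale = (x_max - x_min) / (qmax - qmin)
    scale = np.where(scale == 0, 1.0, scale)          # guard against constant channels
    zero_point = np.round(qmin - x_min / scale)
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale                    # dequantized values

def rotate_bias_correct_quantize(activations, n_bits=4, seed=None):
    """Hypothetical sketch: rotation + channel-mean (bias) correction
    + asymmetric scaling before quantization. Not the authors' code."""
    rng = np.random.default_rng(seed)
    d = activations.shape[1]
    # Random orthogonal rotation (QR of a Gaussian matrix), as in rotation-based PTQ.
    rotation, _ = np.linalg.qr(rng.standard_normal((d, d)))
    rotated = activations @ rotation
    # Bias correction: remove the per-channel mean so channels are centred
    # before quantization; the bias is added back afterwards.
    channel_bias = rotated.mean(axis=0, keepdims=True)
    centred = rotated - channel_bias
    dequant = quantize_asymmetric(centred, n_bits) + channel_bias
    return dequant @ rotation.T                        # undo the rotation

if __name__ == "__main__":
    x = np.random.default_rng(0).standard_normal((128, 64)) + 3.0  # channels with nonzero mean
    x_hat = rotate_bias_correct_quantize(x, n_bits=4, seed=0)
    print("mean reconstruction error:", np.abs(x - x_hat).mean())
```

The point of the sketch is only the ordering of the steps: centring each rotated channel before applying an asymmetric (min/max) quantization grid reduces the rounding and clipping error that a symmetric, uncentred grid would incur.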

Takeaways, Limitations

Takeaways:
The paper clearly identifies the limitations of existing rotation-based quantization methods (failure to align channel means, increased error due to Gaussian activation distributions) and proposes an effective method, BASE-Q, to address them.
BASE-Q significantly improves memory efficiency through block-wise optimization, avoiding full-model backpropagation (see the sketch after this list).
It shows consistent performance improvements over existing methods across various LLMs and benchmarks.
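The summary does not detail how the block-wise optimization works; as a hedged illustration of the general idea (calibrating each block against its full-precision output so gradients never cross block boundaries), here is a minimal PyTorch sketch. The `quantize_fn` callable and all names are hypothetical placeholders, not BASE-Q's actual procedure.

```python
import torch

def blockwise_calibrate(blocks, calib_inputs, quantize_fn, steps=100, lr=1e-3):
    """Hypothetical block-wise calibration: tune each block's quantization
    parameters against the full-precision block output, so backpropagation
    stays inside a single block instead of spanning the whole model."""
    quantized_blocks = []
    x = calib_inputs
    for block in blocks:
        with torch.no_grad():
            target = block(x)                  # full-precision reference output
        q_block = quantize_fn(block)           # block with learnable quant params (hypothetical)
        params = [p for p in q_block.parameters() if p.requires_grad]
        opt = torch.optim.Adam(params, lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = torch.nn.functional.mse_loss(q_block(x), target)
            loss.backward()                    # gradients confined to this block
            opt.step()
        quantized_blocks.append(q_block)
        with torch.no_grad():
            x = q_block(x)                     # feed quantized activations to the next block
    return quantized_blocks
```

Because each optimization step only touches one block, peak memory scales with a single block's activations and parameters rather than with the full model, which is the memory benefit the takeaway refers to.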
Limitations:
The code has not been released yet.
Although results are presented on various LLMs and benchmarks, the analysis of cases where performance is unusually strong or weak for a specific LLM or benchmark may be insufficient.
A detailed description of BASE-Q's block-wise optimization strategy may be lacking.