Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Recipes for Pre-training LLMs with MXFP8

Created by
  • Haebom

Author

Asit Mishra, Dusan Stosic, Simon Layton, Paulius Micikevicius

Outline

This paper explores techniques for representing model parameters and associated tensors with fewer bits, using the Microscaling (MX) format introduced with NVIDIA's Blackwell-generation GPUs. The MX format combines narrow floating-point data types with fine-grained, block-wise scaling factors, allowing more tensors to be quantized and computed more efficiently than with conventional approaches. The paper examines the parameter choices needed to use the MX format effectively and presents a recipe, based on the MXFP8-E4M3 data type and a specific conversion algorithm, that matches the results obtained with BF16. The authors demonstrate training of models with up to 8 billion parameters on high-quality datasets of up to 15 trillion tokens.
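To make the block-wise scaling idea concrete, below is a minimal PyTorch sketch of MXFP8-E4M3-style quantization, in which each block of 32 values shares one power-of-two scale derived from the block's absolute maximum. The block size of 32, the E4M3 maximum of 448, the ceil-based exponent rounding, and the helper names quantize_mxfp8_block / dequantize are illustrative assumptions; this is not the paper's exact conversion algorithm or NVIDIA's hardware implementation.

```python
import torch

def quantize_mxfp8_block(x: torch.Tensor, block_size: int = 32):
    """Illustrative MXFP8-E4M3-style quantization: each block of
    `block_size` values shares one power-of-two scale.
    Not the paper's exact conversion algorithm."""
    # Pad so the flattened tensor splits evenly into blocks.
    pad = (-x.numel()) % block_size
    x_padded = torch.nn.functional.pad(x.flatten(), (0, pad))
    blocks = x_padded.view(-1, block_size)

    # Per-block power-of-two scale chosen so the block's absolute maximum
    # fits within the E4M3 representable range (max normal value = 448).
    amax = blocks.abs().amax(dim=1, keepdim=True).clamp(min=1e-12)
    exp = torch.ceil(torch.log2(amax / 448.0))  # assumed E8M0-style exponent
    scale = torch.exp2(exp)

    # Cast the scaled blocks to FP8 E4M3 (available in recent PyTorch builds).
    q = (blocks / scale).to(torch.float8_e4m3fn)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Recover an approximation of the original values.
    return q.to(torch.float32) * scale

if __name__ == "__main__":
    w = torch.randn(1024)
    q, s = quantize_mxfp8_block(w)
    w_hat = dequantize(q, s).flatten()[: w.numel()]
    print("max abs error:", (w - w_hat).abs().max().item())
```

The key design point the sketch illustrates is that the scale is shared per small block rather than per tensor, which keeps the quantization error local to each block instead of letting a single outlier degrade the whole tensor.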

Takeaways, Limitations

Takeaways:
  • Presents an efficient model-training method using the MX format.
  • Proposes a recipe using the MXFP8-E4M3 data type and a conversion algorithm that matches BF16 performance.
  • Presents experimental results on large-scale models (up to 8 billion parameters) and large-scale datasets (up to 15 trillion tokens).
  • Suggests potential gains in GPU memory efficiency and training speed.
Limitations:
  • Further research is needed to determine whether the proposed method applies to all models and datasets.
  • Results may depend on specific NVIDIA GPU architectures.
  • Lacks a comparative analysis against other quantization techniques.