This paper explores a technique for representing model parameters and associated tensors using fewer bits, based on the Microscaling (MX) format supported natively by NVIDIA's Blackwell generation of GPUs. The MX format combines a narrow floating-point data type with fine-grained, block-wise scaling factors, enabling quantization of more tensors and more efficient computation than conventional approaches. We examine the parameter choices required to use the MX format effectively and present a recipe, based on the MXFP8-E4M3 data type and a specific transformation algorithm, that achieves results comparable to those of BF16 training. We demonstrate this by training models with up to 8 billion parameters on high-quality datasets of up to 15 trillion tokens.
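To make the block-wise scaling idea concrete, the following is a minimal NumPy sketch of MX-style quantization: each block of 32 values shares a single power-of-two scale, and the scaled values are rounded to an E4M3-like precision (4 significant mantissa bits, maximum normal value 448). This is an illustrative emulation, not the paper's recipe or NVIDIA's hardware implementation; the block size, the ceiling-based scale choice, and the mantissa-rounding emulation (which ignores subnormals) are assumptions for the sketch.

```python
import numpy as np

E4M3_MAX = 448.0  # largest normal value representable in FP8 E4M3

def quantize_mx_block(x, block=32):
    """Emulate MX-style block quantization and return the
    dequantized values plus the per-block power-of-two scales."""
    x = np.asarray(x, dtype=np.float32)
    out = np.empty_like(x)
    scales = []
    for i in range(0, x.size, block):
        blk = x[i:i + block]
        amax = float(np.max(np.abs(blk)))
        # Choose a power-of-two scale so the block's max maps inside
        # E4M3's range (ceil keeps the scaled max at or below E4M3_MAX).
        exp = 0 if amax == 0.0 else int(np.ceil(np.log2(amax / E4M3_MAX)))
        scale = 2.0 ** exp
        scaled = np.clip(blk / scale, -E4M3_MAX, E4M3_MAX)
        # Crude E4M3 rounding emulation: keep 4 significant mantissa
        # bits (1 implicit + 3 stored); subnormals are not modeled.
        m, e = np.frexp(scaled)          # m in [0.5, 1), value = m * 2**e
        q = np.ldexp(np.round(m * 16) / 16, e)
        out[i:i + block] = q * scale     # dequantize for comparison
        scales.append(scale)
    return out, scales
```

Because each 32-element block carries its own scale, a single outlier only degrades the precision of its own block rather than the whole tensor, which is the key advantage over per-tensor scaling.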