This paper presents several methods for the lossy compression (quantization) of large matrices. Such quantization is crucial for accelerating matrix multiplication, a core operation in large-scale language models, where the speed of loading the matrices from memory is the main bottleneck. Unlike classical vector quantization and rate-distortion theory, the goal here is to approximate the matrix product rather than the matrices themselves. The paper provides a non-asymptotic lower bound on the mean-squared approximation error for matrices A and B with iid Gaussian entries. It then constructs a universal quantizer based on nested lattices with an explicit guarantee on the approximation error for any pair of matrices, and shows that this quantizer is asymptotically optimal for iid Gaussian matrices, achieving the lower bound. A practical, low-complexity version of the quantizer exhibits near-optimal performance. The paper also derives the rate-distortion function for the multiplication of iid Gaussian matrices and demonstrates the necessity of Johnson-Lindenstrauss dimensionality reduction (sketching) in the low-rate regime.
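To make the problem setup concrete, the following is a minimal NumPy sketch of approximate matrix multiplication under quantization: A and B are quantized separately at a given rate, and the quality of the approximation is measured by the mean-squared error of the product, not of the matrices. The uniform scalar quantizer used here is only an illustrative baseline, not the nested-lattice construction of the paper, and all dimensions and rates are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 256, 256, 256          # A is n x k, B is k x m (arbitrary sizes for illustration)
R = 4                            # bits per entry spent on each matrix (illustrative rate)

A = rng.standard_normal((n, k))  # iid Gaussian entries, as in the paper's model
B = rng.standard_normal((k, m))

def uniform_quantize(X, bits):
    """Uniform scalar quantizer: a simple baseline stand-in,
    not the paper's nested-lattice quantizer."""
    levels = 2 ** bits
    lo, hi = X.min(), X.max()
    step = (hi - lo) / (levels - 1)
    return np.round((X - lo) / step) * step + lo

A_hat = uniform_quantize(A, R)
B_hat = uniform_quantize(B, R)

exact = A @ B
approx = A_hat @ B_hat

# Figure of merit: mean-squared error of the product approximation,
# normalized by the energy of the exact product.
mse = np.mean((exact - approx) ** 2)
print("relative MSE of A_hat @ B_hat:", mse / np.mean(exact ** 2))
```

In this framing, a quantizer is judged by how small it can make this product MSE at a given bit budget per entry, which is exactly the trade-off the paper's lower bound and nested-lattice scheme characterize.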