This paper proposes MQuant, a post-training quantization (PTQ) framework for efficient inference of multimodal large language models (MLLMs). To address the challenges that the large parameter counts and high computational demands of MLLMs pose for practical deployment, MQuant introduces Modality-Specific Static Quantization (MSQ), Attention-Invariant Flexible Switching (AIFS), and Rotation Magnitude Suppression (RMS), achieving superior performance over existing PTQ baselines. MSQ assigns separate static quantization scales to visual and textual tokens. AIFS rearranges the token order to eliminate computationally expensive per-token scale calculations while preserving causal attention. RMS mitigates weight outliers introduced by online Hadamard rotations. We demonstrate that MQuant reduces inference latency by up to 30% on five leading MLLMs, including Qwen-VL, MiniCPM-V, and CogVLM2, while maintaining near-floating-point accuracy (<1% degradation) under W4A8 quantization. The source code is available on GitHub.
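To make the MSQ idea concrete, the sketch below illustrates per-modality static activation quantization: visual and textual tokens are fake-quantized with separate precomputed scales instead of a single shared or per-token dynamic scale. This is only a minimal illustration, not the released implementation; the function names, the symmetric quantization scheme, and the example scale values are assumptions made for the example.

```python
import torch

def quantize_static(x: torch.Tensor, scale: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    """Symmetric fake quantization with a precomputed (static) scale."""
    qmax = 2 ** (n_bits - 1) - 1
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q * scale  # dequantized activations

def msq_quantize(tokens: torch.Tensor,
                 visual_mask: torch.Tensor,
                 scale_visual: torch.Tensor,
                 scale_text: torch.Tensor) -> torch.Tensor:
    """Modality-Specific Static Quantization (illustrative sketch):
    visual and textual tokens each use their own calibrated static scale."""
    out = torch.empty_like(tokens)
    out[visual_mask] = quantize_static(tokens[visual_mask], scale_visual)
    out[~visual_mask] = quantize_static(tokens[~visual_mask], scale_text)
    return out

# Toy usage: 7 tokens of width 16, the first 4 are visual tokens.
tokens = torch.randn(7, 16)
visual_mask = torch.tensor([True, True, True, True, False, False, False])
deq = msq_quantize(tokens, visual_mask,
                   scale_visual=torch.tensor(0.05),  # hypothetical calibrated scale
                   scale_text=torch.tensor(0.02))    # hypothetical calibrated scale
```

Because both scales are fixed after calibration, the per-token max/scale computation of dynamic quantization is avoided at inference time, which is the cost that AIFS further removes by regrouping tokens of the same modality while keeping the causal attention pattern intact.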