We propose MindVL, a multimodal large language model (MLLM) trained on Ascend NPUs. MindVL aims to overcome the dependence on closed data recipes and hardware platforms that hinders open research and reproducibility. It supports stable, high-performance training of large-scale dense and Mixture-of-Experts (MoE) models on Ascend hardware through an efficient training framework, MindSpeed-MLLM. Furthermore, we provide a systematic, open description of our data preparation methods and mixing strategy. MindVL is data-efficient and trained end-to-end on Ascend NPUs; we further improve performance by averaging weights across checkpoints trained with different sequence lengths and by searching over input resolutions at test time. MindVL-8B matches the performance of Qwen2.5VL-7B with only 10% of its training data, and the MoE model MindVL-671B-A37B matches Qwen2.5VL-72B with only 3%.
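
A minimal sketch of the checkpoint weight-averaging idea mentioned above, assuming PyTorch-style state dicts; the function name and checkpoint filenames are illustrative, not the authors' actual procedure:

```python
# Sketch: element-wise averaging of checkpoints trained with different
# sequence lengths into a single merged set of weights.
import torch

def average_checkpoints(paths):
    """Return the element-wise average of the state dicts stored at `paths`."""
    avg_state = None
    for path in paths:
        state = torch.load(path, map_location="cpu")
        if avg_state is None:
            # Copy the first checkpoint in float32 to accumulate safely.
            avg_state = {k: v.clone().float() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg_state[k] += v.float()
    n = len(paths)
    return {k: v / n for k, v in avg_state.items()}

# Hypothetical usage; filenames are placeholders.
merged = average_checkpoints(["ckpt_seq8k.pt", "ckpt_seq16k.pt", "ckpt_seq32k.pt"])
torch.save(merged, "mindvl_merged.pt")
```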