This paper proposes MindVL, a multimodal large language model (MLLM) trained on Ascend NPUs. MindVL addresses two issues in existing MLLM training: dependence on a narrow range of hardware platforms and closed data recipes. It supports stable, high-performance training of large-scale Dense and Mixture-of-Experts (MoE) models on Ascend hardware through an efficient training framework called MindSpeed-MLLM, and it provides a systematic, open description of the training data generation methods and mixing strategy. The result is a data-efficient MLLM trained end-to-end on Ascend NPUs. MindVL further improves performance by averaging the weights of checkpoints trained with different sequence lengths and by employing a test-time resolution search technique. MindVL-8B matches the performance of Qwen2.5-VL-7B using only 10% of its training data, and MindVL-671B-A37B, an MoE model, achieves performance comparable to Qwen2.5-VL-72B with only 3% of its data.
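To make the checkpoint weight-averaging idea concrete, below is a minimal sketch of element-wise parameter averaging across checkpoints trained with different sequence lengths. It assumes PyTorch state dicts; the file names, dtypes, and helper function are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of checkpoint weight averaging ("model souping"), assuming
# PyTorch state_dicts saved with torch.save. File names are hypothetical.
import torch

def average_checkpoints(paths):
    """Return the element-wise average of the parameters in several checkpoints."""
    avg_state = None
    for path in paths:
        state = torch.load(path, map_location="cpu")
        if avg_state is None:
            # Accumulate in float32 to avoid precision loss during summation.
            avg_state = {k: v.clone().float() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg_state[k] += v.float()
    n = len(paths)
    # Cast back to the training dtype (bf16 assumed here) after averaging.
    return {k: (v / n).to(torch.bfloat16) for k, v in avg_state.items()}

# Hypothetical checkpoints, each fine-tuned with a different max sequence length.
merged = average_checkpoints(["ckpt_seq4k.pt", "ckpt_seq8k.pt", "ckpt_seq16k.pt"])
torch.save(merged, "mindvl_merged.pt")
```

This uniform average is the simplest variant; weighted averaging by validation score is a common alternative, but the paper's exact weighting scheme is not specified here.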