Large multimodal models (LMMs) have been extensively evaluated on tasks such as visual question answering (VQA), image captioning, and grounding, but rigorous assessments of their alignment with human-centered (HC) values such as fairness, ethics, and inclusivity are still lacking. To address this gap, this paper presents HumaniBench, a novel benchmark of 32,000 real-world image-question pairs accompanied by an evaluation tool. Labels are generated through an AI-assisted pipeline and validated by experts. HumaniBench evaluates LMMs across a variety of open-ended and closed-ended VQA tasks organized around seven key alignment principles: fairness, ethics, empathy, inclusivity, inference, robustness, and multilingualism. These principles, grounded in AI ethics and practical requirements, provide a holistic view of social impact. Benchmarking results across various LMMs show that proprietary models generally lead in inference, fairness, and multilingualism, while open-source models are stronger in robustness and grounding. Most models struggle to balance accuracy with ethical and inclusive behavior. Techniques such as chain-of-thought prompting and test-time scaling improve alignment. As the first benchmark tailored to HC alignment, HumaniBench provides a rigorous testbed for diagnosing limitations and promoting responsible LMM development. All data and code are publicly available for reproducibility.