This paper focuses on developing a robust system for automatically detecting hateful memes, a serious problem on the Internet. While large multimodal models (LMMs) have shown promising results, they face challenges such as suboptimal within-domain performance and limited cross-domain generalization. To address these challenges, we propose a robust adaptation framework that preserves the general vision-language capabilities of LMMs while improving both within-domain accuracy and cross-domain generalization. The proposed method is also more robust to adversarial attacks than existing supervised fine-tuned (SFT) models. Experimental results on six meme classification datasets show that it outperforms existing state-of-the-art models and generates higher-quality evidence, thereby enhancing the model's interpretability.