This paper addresses the problem that hateful memes targeting the LGBTQ+ community can evade detection systems through even minor alterations to captions or images. Using the PrideMM dataset, we build the first robustness benchmark for this task by combining four realistic caption attacks with three common image corruptions. Taking two state-of-the-art detectors, MemeCLIP and MemeBLIP2, as case studies, we also present a lightweight Text Denoising Adapter (TDA) that improves the resilience of MemeBLIP2. Experimental results show that MemeCLIP degrades more gracefully, while MemeBLIP2 is particularly sensitive to caption attacks that interfere with its language processing. Adding the TDA, however, not only addresses this weakness but also makes MemeBLIP2 the most robust model overall. Further analysis reveals that while all systems rely heavily on text, the choice of architecture and pretraining data significantly affects robustness. This benchmark highlights vulnerabilities in current multimodal safety models and demonstrates that targeted, lightweight modules like the TDA are an effective way to achieve stronger defenses.
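The abstract does not spell out the TDA architecture. As a rough illustration only, the sketch below shows one common form such a lightweight adapter could take: a small bottleneck MLP with a residual connection inserted on the text-embedding path of a frozen backbone. The class name `TextDenoisingAdapter`, the dimensions, and the placement are assumptions for illustration, not the paper's actual design.

```python
# Hypothetical sketch of a lightweight text-denoising adapter: a bottleneck MLP
# with a residual connection applied to (possibly perturbed) text embeddings
# before fusion. Details here are assumptions, not the paper's TDA design.
import torch
import torch.nn as nn


class TextDenoisingAdapter(nn.Module):
    """Maps perturbed text embeddings back toward a cleaner representation
    via a small residual correction, leaving the frozen backbone untouched."""

    def __init__(self, dim: int = 512, bottleneck: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, bottleneck),
            nn.GELU(),
            nn.Linear(bottleneck, dim),
        )

    def forward(self, text_emb: torch.Tensor) -> torch.Tensor:
        # The residual connection keeps the module lightweight and lets it
        # learn only a small corrective offset for attacked captions.
        return text_emb + self.net(text_emb)


if __name__ == "__main__":
    adapter = TextDenoisingAdapter(dim=512)
    noisy = torch.randn(8, 512)   # e.g. embeddings of attacked captions
    denoised = adapter(noisy)     # corrected features passed on to the fusion head
    print(denoised.shape)         # torch.Size([8, 512])
```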