As Vision-Language Models (VLMs) are increasingly deployed in real-world applications, previously overlooked safety risks are becoming apparent. In particular, seemingly innocuous multimodal inputs can combine to reveal harmful intent, leading to unsafe model outputs. Safety via Intent Awareness (SIA) is a training-free, intent-aware safety framework proposed to address these risks: it proactively detects harmful intent in multimodal inputs and uses it to guide the generation of safe responses. SIA operates in three steps: visual abstraction via captioning, intent inference through few-shot Chain-of-Thought (CoT) prompting, and intent-conditioned response generation. By dynamically adapting to the implicit intent inferred from image-text pairs, SIA mitigates harmful outputs without extensive retraining. Extensive experiments on safety benchmarks, including SIUO, MM-SafetyBench, and HoliSafe, show that SIA consistently improves safety and outperforms existing training-free methods.
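To make the three-step pipeline concrete, here is a minimal Python sketch of how the stages could compose. This is an illustration under stated assumptions, not the authors' implementation: the `vlm_generate` interface, the prompt wording, and the few-shot exemplars are all hypothetical, since the abstract describes the stages only at a high level.

```python
"""Minimal sketch of an SIA-style pipeline:
visual abstraction -> intent inference (few-shot CoT) -> intent-conditioned response.
All prompts, exemplars, and the VLM interface below are illustrative assumptions."""

from typing import Callable

# Assumed interface: any callable that sends (image_bytes, prompt) to a VLM
# and returns its text output, e.g. a thin wrapper around an API client.
VLMGenerate = Callable[[bytes, str], str]

# Hypothetical few-shot CoT exemplars; real exemplars would pair image
# captions with user queries and labeled intent reasoning.
FEW_SHOT_COT = """\
Caption: a photo of a locked door. Query: "How do I open this without a key?"
Reasoning: combined, the inputs suggest unauthorized entry. Intent: harmful.
Caption: a photo of a birthday cake. Query: "How do I make this?"
Reasoning: this is an ordinary recipe request. Intent: benign.
"""


def visual_abstraction(image: bytes, generate: VLMGenerate) -> str:
    """Step 1: abstract the image into a neutral caption."""
    return generate(image, "Describe this image in one neutral sentence.")


def infer_intent(caption: str, query: str, generate: VLMGenerate) -> str:
    """Step 2: few-shot CoT intent inference over the caption + query."""
    prompt = (
        f"{FEW_SHOT_COT}\n"
        f"Caption: {caption}. Query: \"{query}\"\n"
        "Reasoning:"
    )
    return generate(b"", prompt)  # text-only call; the caption stands in for the image


def safe_response(image: bytes, query: str, intent: str, generate: VLMGenerate) -> str:
    """Step 3: condition the final answer on the inferred intent."""
    prompt = (
        f"Inferred intent of this request: {intent}\n"
        "If the intent is harmful, refuse and briefly explain why; "
        f"otherwise answer helpfully.\nUser query: {query}"
    )
    return generate(image, prompt)


def sia(image: bytes, query: str, generate: VLMGenerate) -> str:
    """Full SIA-style pass: caption, infer intent, then respond."""
    caption = visual_abstraction(image, generate)
    intent = infer_intent(caption, query, generate)
    return safe_response(image, query, intent, generate)


if __name__ == "__main__":
    # Trivial mock so the sketch runs end to end without a real VLM.
    def mock_generate(image: bytes, prompt: str) -> str:
        return "Intent: benign." if "Reasoning:" in prompt else "ok"

    print(sia(b"", "How do I make this?", mock_generate))
```

Note the design property this structure reflects: every stage is just a prompt to the same frozen VLM, so the pipeline requires no gradient updates, which is what makes the approach training-free.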