Daily Arxiv

This page organizes papers on artificial intelligence published around the world.
Summaries are generated using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.

SIA: Enhancing Safety via Intent Awareness for Vision-Language Models

Created by
  • Haebom

Author

Youngjin Na, Sangheon Jeong, Youngwan Lee, Jian Lee, Dawoon Jeong, Youngman Kim

Outline

As Vision-Language Models (VLMs) are increasingly deployed in real-world applications, previously overlooked safety risks are becoming apparent: seemingly innocuous multimodal inputs can combine to convey harmful intent, leading to unsafe model outputs. Safety via Intent Awareness (SIA) is a training-free, intent-aware safety framework proposed to address these risks; it proactively detects harmful intent in multimodal inputs and uses it to guide the generation of safe responses. SIA follows three steps: visual abstraction (captioning), intent inference via few-shot Chain-of-Thought (CoT) prompting, and intent-based response generation. By dynamically adapting to the implicit intent inferred from image-text pairs, SIA mitigates harmful outputs without extensive retraining. Extensive experiments on safety benchmarks such as SIUO, MM-SafetyBench, and HoliSafe show that SIA consistently improves safety and outperforms existing training-free methods.
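To make the three-step pipeline concrete, here is a minimal sketch of how caption-then-infer-then-respond could be wired together. The function names, prompt wording, and model interfaces (`caption_fn`, `llm_fn`) are assumptions for illustration, not the authors' exact implementation.

```python
# Hypothetical sketch of the SIA pipeline: visual abstraction ->
# intent inference (few-shot CoT) -> intent-conditioned response.
from typing import Callable

# Assumed few-shot CoT exemplar; the paper's actual prompts may differ.
FEW_SHOT_COT = (
    "Example -- Caption: 'a locked door'. Text: 'how do I get in without a key?'\n"
    "Reasoning: combining the image and text implies bypassing a lock.\n"
    "Intent: harmful\n"
)

def sia_respond(
    image,
    user_text: str,
    caption_fn: Callable,          # VLM captioner: image -> str
    llm_fn: Callable[[str], str],  # text model: prompt -> str
) -> str:
    # Step 1: visual abstraction -- reduce the image to a text caption.
    caption = caption_fn(image)

    # Step 2: intent inference -- few-shot CoT prompting over the
    # combined caption and user text.
    intent = llm_fn(
        f"{FEW_SHOT_COT}\n"
        f"Caption: '{caption}'. Text: '{user_text}'\n"
        "Reason step by step, then answer 'Intent: harmful' or 'Intent: benign'."
    )

    # Step 3: intent-based response generation -- the inferred intent
    # steers the final answer toward a safe response when needed.
    return llm_fn(
        f"Inferred intent: {intent}\n"
        f"Caption: {caption}\nUser: {user_text}\n"
        "If the intent is harmful, refuse and briefly explain why; "
        "otherwise answer helpfully."
    )
```

Because every step operates on text, the same wrapper can sit in front of any captioner and instruction-following LLM without retraining either model, which is the core appeal of the training-free design.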

Takeaways, Limitations

Training-free safety framework: improves safety without extensive retraining.
Intent-aware approach: effectively detects potential risks in multimodal inputs.
Experimental results: outperforms existing training-free methods on multiple safety benchmarks.
Limitations: not explicitly discussed in the abstract.