Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Multi-View Slot Attention Using Paraphrased Texts for Face Anti-Spoofing

Created by
  • Haebom

Authors

Jeongmin Yu, Susang Kim, Kisu Lee, Taekyoung Kwon, Won-Yong Shin, Ha Young Kim

Outline

While CLIP-based face anti-spoofing (FAS) methods have demonstrated remarkable cross-domain performance, existing models do not fully utilize CLIP's patch embedding tokens and therefore miss important spoofing cues. Furthermore, their reliance on a single text prompt limits generalization. To address these challenges, we propose MVP-FAS, a novel framework that integrates two core modules: Multi-View Slot Attention (MVS) and Multi-Text Patch Alignment (MTPA). MVS uses multiple paraphrased texts, each offering a different view, to extract local fine-grained spatial features and global context from the patch embeddings, while MTPA aligns the patches with multiple text representations to improve semantic robustness. Extensive experiments demonstrate that MVP-FAS achieves superior generalization performance compared to existing state-of-the-art methods. The code is available on GitHub.
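To make the idea of aligning patches with several text prompts more concrete, here is a minimal sketch in plain Python. It is not the paper's MTPA module or its loss. It only illustrates the general pattern of scoring patch embeddings against multiple paraphrased text embeddings per class and averaging the similarities, so that no single prompt dominates. The function names (`cosine`, `softmax`, `multi_text_patch_scores`), the toy 2-dimensional vectors, and the temperature value are all assumptions made for illustration. In practice the embeddings would come from CLIP's image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def multi_text_patch_scores(patches, class_texts, temperature=0.07):
    """Hypothetical helper: class probabilities from patch/text similarity.

    patches:     list of patch embedding vectors for one image.
    class_texts: one list of paraphrase embeddings per class.
    For each class, the cosine similarity is averaged over every
    (patch, paraphrase) pair, scaled by the temperature (an assumed
    value), and the per-class logits are passed through a softmax.
    """
    logits = []
    for texts in class_texts:
        sims = [cosine(p, t) for p in patches for t in texts]
        logits.append(sum(sims) / len(sims) / temperature)
    return softmax(logits)


# Toy data (made up): patches point roughly along the "live" direction.
patches = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.0]]
live_texts = [[1.0, 0.0], [0.9, 0.1]]    # two paraphrases of a "live" prompt
spoof_texts = [[0.0, 1.0], [0.1, 0.9]]   # two paraphrases of a "spoof" prompt

probs = multi_text_patch_scores(patches, [live_texts, spoof_texts])
print(probs)  # index 0 = live, index 1 = spoof
```

Averaging over paraphrases is one simple way to reduce sensitivity to the wording of any single prompt. The actual framework may combine text and patch information quite differently, for example through slot attention over the patch tokens.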

Takeaways, Limitations

Takeaways:
Overcomes a limitation of existing CLIP-based FAS models by making effective use of CLIP's patch embedding tokens.
Improves generalization performance by using multiple text prompts instead of a single one.
Combines local fine-grained features and global contextual information through the MVS and MTPA modules.
Achieves state-of-the-art cross-domain FAS performance.
Publicly available code supports reproducibility and further research.
Limitations:
Lack of analysis on the computational cost and efficiency of the proposed method.
Lack of robustness assessment against various types of spoofing attacks.
Lack of performance evaluation in real-world environments.