Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions. When sharing, please cite the source.

Towards Unified Multimodal Misinformation Detection in Social Media: A Benchmark Dataset and Baseline

Created by
  • Haebom

Authors

Haiyang Li, Yaxiong Wang, Shengeng Tang, Lianwei Wu, Lechao Cheng, Zhun Zhong

Outline

Detecting fake multimodal content on social media is growing in importance. Deception takes two primary forms: human-written misinformation and AI-generated content produced by image synthesis models or vision-language models (VLMs). Existing research addresses these two forms separately, which limits its effectiveness in real-world settings where the type of deception is not known in advance. To address this, the paper builds OmniFake, a comprehensive benchmark of 127,000 samples that combines human-curated misinformation collected from existing sources with newly synthesized AI-generated examples. The authors also propose the Unified Multimodal Fake Content Detection (UMFDet) framework, designed to handle both forms of deception. UMFDet uses a VLM backbone augmented with a category-aware Mixture-of-Experts (MoE) Adapter and an Attribution Chain-of-Thought mechanism that provides implicit reasoning guidance for identifying key deceptive signals. Experiments show that UMFDet performs robustly and consistently across both types of misinformation, outperforms specialized baselines, and offers a practical solution for real-world multimodal deception detection.
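The summary does not describe how the category-aware MoE Adapter is built, so the following is only a generic sketch of soft Mixture-of-Experts routing with a category-conditioned gate, not the paper's implementation. The class name `MoEAdapter`, the bottleneck-adapter experts, the per-category gate bias, and all dimensions are assumptions made for illustration. The category id (for example 0 for human-written, 1 for AI-generated) is assumed to be supplied by the caller; how UMFDet obtains or uses that signal is not stated here.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Build a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class MoEAdapter:
    """Hypothetical soft-routing MoE adapter with a category-conditioned gate."""

    def __init__(self, dim, bottleneck, num_experts, num_categories, seed=0):
        rng = random.Random(seed)
        # Each expert is a bottleneck adapter: down-projection, ReLU, up-projection.
        self.down = [random_matrix(bottleneck, dim, rng) for _ in range(num_experts)]
        self.up = [random_matrix(dim, bottleneck, rng) for _ in range(num_experts)]
        # The gate scores experts from the input features.
        self.gate = random_matrix(num_experts, dim, rng)
        # Assumption: a per-category bias on the gate logits makes routing category-aware.
        self.category_bias = random_matrix(num_categories, num_experts, rng, scale=1.0)

    def route(self, features, category):
        """Return one mixing weight per expert; the weights sum to 1."""
        logits = matvec(self.gate, features)
        biased = [l + b for l, b in zip(logits, self.category_bias[category])]
        return softmax(biased)

    def forward(self, features, category):
        """Add the weighted expert outputs to the input (residual connection)."""
        weights = self.route(features, category)
        output = list(features)
        for weight, down, up in zip(weights, self.down, self.up):
            hidden = [max(0.0, h) for h in matvec(down, features)]
            delta = matvec(up, hidden)
            output = [o + weight * d for o, d in zip(output, delta)]
        return output, weights


adapter = MoEAdapter(dim=8, bottleneck=4, num_experts=3, num_categories=2)
features = [0.5, -0.2, 0.1, 0.9, -0.7, 0.3, 0.0, 0.4]
adapted, weights = adapter.forward(features, category=1)
```

In this sketch every expert contributes in proportion to its gate weight (soft routing). A sparse top-k variant would keep only the highest-weighted experts, and whether the paper uses soft or sparse routing is not something this summary specifies.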

Takeaways, Limitations

Takeaways:
The OmniFake dataset provides a comprehensive benchmark that combines human-written misinformation with AI-generated content.
The proposed UMFDet framework offers a single solution that detects both types of deceptive content.
The category-aware MoE Adapter and the Attribution Chain-of-Thought mechanism improve detection performance.
The framework is presented as a practical multimodal deception detection system applicable to real-world environments.
Limitations:
This summary lacks detail on the dataset composition, the training method, and the performance analysis reported in the paper.
Implementation details and parameter settings for the MoE Adapter and the Chain-of-Thought mechanism are not provided.
Generalization across different AI generative models and types of misinformation still needs to be validated.