
DFBench: Benchmarking Deepfake Image Detection Capability of Large Multimodal Models

Main: 6 pages, 5 figures, 4 tables; bibliography: 2 pages
Abstract

With the rapid advancement of generative models, the realism of AI-generated images has improved dramatically, posing critical challenges for verifying the authenticity of digital content. Existing deepfake detection methods often rely on datasets with limited generator and content diversity, which fail to keep pace with the evolving complexity and increasing realism of AI-generated content. Large multimodal models (LMMs), widely adopted across vision tasks, have demonstrated strong zero-shot capabilities, yet their potential for deepfake detection remains largely unexplored. To bridge this gap, we present DFBench, a large-scale DeepFake Benchmark featuring (i) broad diversity: 540,000 images spanning real, AI-edited, and AI-generated content; (ii) up-to-date generators: fake images produced by 12 state-of-the-art generation models; and (iii) bidirectional benchmarking: evaluation of both the detection accuracy of deepfake detectors and the evasion capability of generative models. Building on DFBench, we propose MoA-DF, Mixture of Agents for DeepFake detection, which leverages a combined probability strategy over multiple LMMs. MoA-DF achieves state-of-the-art performance, further demonstrating the effectiveness of LMMs for deepfake detection. The database and code are publicly available at this https URL.
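The abstract does not specify how MoA-DF combines the per-model probabilities, so the following is only a minimal sketch of a mixture-of-agents style decision: it assumes each LMM agent emits a "fake" probability for an image, averages them, and thresholds the mean. The function name, probability values, and threshold are all hypothetical.

```python
def moa_decision(fake_probs, threshold=0.5):
    """Combine per-LMM 'fake' probabilities into one verdict.

    fake_probs -- list of probabilities (one per LMM agent) that the
                  image is fake; values are hypothetical placeholders.
    Returns (label, combined_score) where label is 'fake' or 'real'.
    """
    if not fake_probs:
        raise ValueError("need at least one model probability")
    # Simple unweighted average; the paper's actual combination
    # strategy may differ (e.g., weighted or calibrated fusion).
    mean_p = sum(fake_probs) / len(fake_probs)
    return ("fake" if mean_p >= threshold else "real"), mean_p

# Hypothetical outputs from three LMM agents for one image.
label, score = moa_decision([0.8, 0.6, 0.7])
print(label, round(score, 3))  # -> fake 0.7
```

The intuition is that individual LMMs may be unreliable on any single generator's outputs, but fusing their probabilistic judgments reduces the variance of the final decision.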

@article{wang2025_2506.03007,
  title={DFBench: Benchmarking Deepfake Image Detection Capability of Large Multimodal Models},
  author={Jiarui Wang and Huiyu Duan and Juntong Wang and Ziheng Jia and Woo Yi Yang and Xiaorong Zhu and Yu Zhao and Jiaying Qian and Yuke Xing and Guangtao Zhai and Xiongkuo Min},
  journal={arXiv preprint arXiv:2506.03007},
  year={2025}
}