Loupe: A Generalizable and Adaptive Framework for Image Forgery Detection

The proliferation of generative models has raised serious concerns about visual content forgery. Existing deepfake detection methods primarily target either image-level classification or pixel-wise localization. While some achieve high accuracy, they often suffer from limited generalization across manipulation types or rely on complex architectures. In this paper, we propose Loupe, a lightweight yet effective framework for joint deepfake detection and localization. Loupe integrates a patch-aware classifier and a segmentation module with conditional queries, allowing simultaneous global authenticity classification and fine-grained mask prediction. To enhance robustness against distribution shifts in the test set, Loupe introduces a pseudo-label-guided test-time adaptation mechanism that leverages patch-level predictions to supervise the segmentation head. Extensive experiments on the DDL dataset demonstrate that Loupe achieves state-of-the-art performance, securing first place in the IJCAI 2025 Deepfake Detection and Localization Challenge with an overall score of 0.846. Our results validate the effectiveness of the proposed patch-level fusion and conditional query design in improving both classification accuracy and spatial localization under diverse forgery patterns. The code is available at this https URL.
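The pseudo-label-guided test-time adaptation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the helper names (`make_pseudo_mask`, `tta_loss`), the nearest-neighbor upsampling, the 0.5 threshold, and the toy patch grid are all assumptions made for clarity.

```python
import numpy as np

def make_pseudo_mask(patch_probs, patch_size, thresh=0.5):
    """Threshold patch-level forgery probabilities into hard pseudo-labels,
    then upsample to a pixel-wise mask (nearest-neighbor via np.kron).
    Hypothetical helper; the paper's actual procedure may differ."""
    hard = (patch_probs > thresh).astype(np.float32)
    return np.kron(hard, np.ones((patch_size, patch_size), dtype=np.float32))

def tta_loss(seg_probs, pseudo_mask, eps=1e-7):
    """Binary cross-entropy between the segmentation head's output and the
    patch-derived pseudo-labels, used as an unsupervised adaptation signal."""
    p = np.clip(seg_probs, eps, 1 - eps)
    return float(-np.mean(pseudo_mask * np.log(p)
                          + (1 - pseudo_mask) * np.log(1 - p)))

# Toy example: a 2x2 patch grid with 4-pixel patches -> an 8x8 pseudo mask.
patch_probs = np.array([[0.9, 0.1],
                        [0.2, 0.8]])
mask = make_pseudo_mask(patch_probs, patch_size=4)
seg = np.full((8, 8), 0.5)  # a maximally uncertain segmentation output
print(mask.shape, round(tta_loss(seg, mask), 4))  # (8, 8) 0.6931
```

At test time, this loss would be backpropagated into the segmentation head only, so the patch classifier's more robust image-level evidence steers the pixel-level predictions toward the new distribution.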
@article{jiang2025_2506.16819,
  title   = {Loupe: A Generalizable and Adaptive Framework for Image Forgery Detection},
  author  = {Yuchu Jiang and Jiaming Chu and Jian Zhao and Xin Zhang and Xu Yang and Lei Jin and Chi Zhang and Xuelong Li},
  journal = {arXiv preprint arXiv:2506.16819},
  year    = {2025}
}