A Self-Supervised Descriptor for Image Copy Detection

Computer Vision and Pattern Recognition (CVPR), 2022

21 February 2022

Ed Pizzi

Sreya Dutta Roy

Sugosh Nagavara Ravindra

Priya Goyal

Matthijs Douze

Abstract

Image copy detection is an important task for content moderation. We introduce SSCD, a model that builds on a recent self-supervised contrastive training objective. We adapt this method to the copy detection task by changing the architecture and training objective, including a pooling operator from the instance matching literature, and adapting contrastive learning to augmentations that combine images. Our approach relies on an entropy regularization term, promoting consistent separation between descriptor vectors, and we demonstrate that this significantly improves copy detection accuracy. Our method produces a compact descriptor vector, suitable for real-world web-scale applications. Statistical information from a background image distribution can be incorporated into the descriptor. On the recent DISC2021 benchmark, SSCD is shown to outperform both baseline copy detection models and self-supervised architectures designed for image classification by huge margins, in all settings. For example, SSCD outperforms SimCLR descriptors by 48% absolute. Code is available at https://github.com/facebookresearch/sscd-copy-detection.
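
To illustrate how an entropy regularization term can promote separation between descriptor vectors, the following is a minimal sketch and not the released implementation. It assumes a nearest-neighbor form of the regularizer: each L2-normalized descriptor is penalized by the negative log of the distance to its closest other descriptor in the batch. The function names, the `eps` constant, and the toy 2-D descriptors are hypothetical, and the sketch treats every other vector as a negative, whereas a training setup would presumably exclude augmented copies of the same image.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def entropy_regularizer(descriptors, eps=1e-8):
    """Negative mean log distance to the nearest other descriptor.

    Assumed form for illustration only: the value is lower when the
    descriptors are spread apart and higher when they cluster together.
    """
    vecs = [l2_normalize(v) for v in descriptors]
    total = 0.0
    for i, a in enumerate(vecs):
        # Distance to the closest descriptor other than the vector itself.
        nearest = min(math.dist(a, b) for j, b in enumerate(vecs) if j != i)
        total += math.log(nearest + eps)
    return -total / len(vecs)


# Toy 2-D descriptors (hypothetical data): one tight cluster, one even spread.
clustered = [[1.0, 0.0], [0.99, 0.05], [0.98, 0.1], [0.97, 0.15]]
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

loss_clustered = entropy_regularizer(clustered)
loss_spread = entropy_regularizer(spread)
print(loss_clustered > loss_spread)
```

Minimizing a term of this shape alongside a contrastive loss pushes each descriptor away from its nearest neighbor, which is one way to obtain the consistent separation the abstract describes.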
