Whitening for Self-Supervised Representation Learning

International Conference on Machine Learning (ICML), 2021

13 July 2020

Abstract

Most current self-supervised representation learning methods are based on the contrastive loss and the instance-discrimination task, where augmented versions of the same image instance ("positives") are contrasted with instances extracted from other images ("negatives"). For learning to be effective, each positive pair must be compared with many negatives, which is computationally demanding. In this paper, we propose a different direction and a new loss function for self-supervised representation learning, based on whitening the latent-space features. The whitening operation has a "scattering" effect on the batch samples, which takes the place of negatives and avoids degenerate solutions where all the sample representations collapse to a single point. Our Whitening MSE (W-MSE) loss does not require additional momentum networks and is conceptually simple. Moreover, since negatives are not needed, we can extract multiple positive pairs from the same image instance. We empirically show that W-MSE is competitive with popular, more complex self-supervised methods. The source code of the method and of all the experiments is available at https://github.com/htdt/self-supervised.
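
The idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the code from the repository above: the function names `whiten` and `w_mse_loss`, the `eps` regularizer, the choice of Cholesky-based whitening, the L2 normalization step, and the use of a single positive pair per image are all assumptions made for illustration. The sketch whitens the embeddings of a batch so that their covariance is approximately the identity (the "scattering" effect), then takes the mean squared error between the two views of each image.

```python
import numpy as np

def whiten(z, eps=1e-5):
    """Whiten a batch of embeddings z with shape (n, d).

    Assumption: Cholesky-based whitening, one of several valid choices.
    The batch needs more samples than dimensions (n > d) so that the
    covariance matrix is well conditioned.
    """
    z = z - z.mean(axis=0, keepdims=True)            # center each feature
    cov = z.T @ z / (z.shape[0] - 1)                 # (d, d) sample covariance
    cov = cov + eps * np.eye(z.shape[1])             # small ridge for stability
    chol = np.linalg.cholesky(cov)                   # cov = chol @ chol.T
    # Solve chol @ x = z.T, i.e. x = inv(chol) @ z.T, without forming the inverse.
    return np.linalg.solve(chol, z.T).T

def w_mse_loss(z1, z2, eps=1e-5):
    """Hypothetical W-MSE-style loss for one positive pair per image.

    z1[i] and z2[i] are embeddings of two augmentations of image i.
    No negatives are used: whitening keeps the batch from collapsing.
    """
    n = z1.shape[0]
    w = whiten(np.concatenate([z1, z2], axis=0), eps)  # whiten both views jointly
    w = w / np.linalg.norm(w, axis=1, keepdims=True)   # project onto unit sphere
    w1, w2 = w[:n], w[n:]
    return float(np.mean(np.sum((w1 - w2) ** 2, axis=1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(128, 8))                   # stand-in for encoder outputs
    view1 = base + 0.05 * rng.normal(size=base.shape)  # simulated augmentation 1
    view2 = base + 0.05 * rng.normal(size=base.shape)  # simulated augmentation 2
    print("loss:", w_mse_loss(view1, view2))
```

In this sketch a collapsed batch, where every embedding is the same point, has a near-zero covariance and cannot be mapped to identity covariance in any meaningful way, which is why no negative samples appear in the loss. With several augmentations per image, the same function could be averaged over all pairs of views, in line with the abstract's remark about multiple positive pairs.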
