Towards Understanding Sparse Filtering: A Theoretical Perspective

29 March 2016

Abstract

In this paper we study sparse filtering, a recent and effective algorithm for unsupervised learning. The aim of this research is not to show whether or how well sparse filtering works, but to understand why and when it works. We provide a thorough study of the algorithm through a conceptual evaluation of feature distribution learning, a theoretical analysis of the properties of sparse filtering, and an experimental validation of our conclusions. We argue that sparse filtering works by explicitly maximizing the informativeness of the learned representation through the maximization of a proxy of sparsity, and by implicitly preserving the information conveyed by the distribution of the original data through the constraint of structure preservation. In particular, we prove that sparse filtering preserves the cosine neighborhoodness of the data. We validate our statements on artificial and real data sets, applying our theoretical understanding to explain the success of sparse filtering on real-world problems. Our work provides a strong theoretical framework for understanding sparse filtering, highlights the assumptions and conditions for success behind the algorithm, and offers fresh insight into developing new feature distribution learning algorithms.
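
The abstract does not spell out the objective that sparse filtering optimizes, so the following is only a minimal sketch of the formulation as it is commonly described: a linear map followed by a soft absolute value, L2 normalization of each feature across samples, L2 normalization of each sample across features, and an L1 penalty on the result. The function name `sparse_filtering_objective`, the smoothing constant `eps`, and the matrix shapes are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def sparse_filtering_objective(W, X, eps=1e-8):
    """Illustrative sparse filtering objective (a sketch, not the authors' code).

    W: (n_features, n_inputs) weight matrix (hypothetical layout).
    X: (n_inputs, n_samples) data matrix, one sample per column.
    Returns the scalar objective and the normalized feature matrix.
    """
    # Soft absolute value of the linear features; eps keeps it smooth at zero.
    F = np.sqrt((W @ X) ** 2 + eps)
    # Normalize each feature (row) to unit L2 norm across samples.
    F = F / np.linalg.norm(F, axis=1, keepdims=True)
    # Normalize each sample (column) to unit L2 norm across features.
    F = F / np.linalg.norm(F, axis=0, keepdims=True)
    # The L1 norm of the normalized features acts as the sparsity proxy.
    return F.sum(), F


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 20))  # 20 samples with 5 inputs each
    W = rng.normal(size=(3, 5))   # 3 learned features
    value, F = sparse_filtering_objective(W, X)
    print(value, F.shape)
```

In this sketch the second normalization places every sample's feature vector on the unit sphere, so only its direction is retained. That is one informal way to see why an angular notion of similarity such as cosine is the natural quantity to analyze; the preservation result itself is established by the paper's proof, not by this code.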
