On the Sample Complexity of Predictive Sparse Coding

Abstract

Predictive sparse coding algorithms have recently demonstrated impressive performance on a variety of supervised tasks, but they lack a learning-theoretic analysis. We establish the first generalization bounds for predictive sparse coding. In the overcomplete dictionary learning setting, where the dictionary size k exceeds the dimensionality d of the data, we present an estimation error bound that is roughly O(√(dk/m) + √s/(μm)). In the infinite-dimensional setting, we show a dimension-free bound that is roughly O(k√s/(μm)). The quantity μ is a measure of the incoherence of the dictionary and s is the sparsity level. Both bounds are data-dependent, explicitly taking into account certain incoherence properties of the learned dictionary and the sparsity level of the codes learned on actual data.
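The two stated rates can be written in display math for clarity (a sketch of the notation as given in the abstract, where m denotes the number of training samples, k the dictionary size, d the data dimension, s the sparsity level, and μ the dictionary incoherence measure):

```latex
% Overcomplete setting (k > d): estimation error bound, roughly
\[
O\!\left(\sqrt{\frac{dk}{m}} \;+\; \frac{\sqrt{s}}{\mu m}\right)
\]

% Infinite-dimensional setting: dimension-free bound, roughly
\[
O\!\left(\frac{k\sqrt{s}}{\mu m}\right)
\]
```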
