
Dictionary Learning with Few Samples and Matrix Concentration

Abstract

Let $A$ be an $n \times n$ matrix, $X$ be an $n \times p$ matrix, and $Y = AX$. A challenging and important problem in data analysis, motivated by dictionary learning and other practical problems, is to recover both $A$ and $X$ given $Y$. Under normal circumstances, this problem is clearly underdetermined. However, when $X$ is sparse and random, Spielman, Wang and Wright showed that one can recover both $A$ and $X$ efficiently from $Y$ with high probability, provided that $p$ (the number of samples) is sufficiently large. Their method works for $p \ge C n^2 \log^2 n$, and they conjectured that $p \ge C n \log n$ suffices. The bound $n \log n$ is sharp for an obvious information-theoretic reason. In this paper, we show that $p \ge C n \log^4 n$ suffices, matching the conjectured bound up to a polylogarithmic factor. The core of our proof is a theorem concerning $\ell_1$ concentration of random matrices, which is of independent interest. Our proof of the concentration result is based on two ideas. The first is an economical way to apply the union bound. The second is a refined version of Bernstein's concentration inequality for sums of independent variables. Neither has anything to do with random matrices, and both are applicable in general settings.
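
The following is a minimal sketch, not taken from the paper, of the generative model the abstract describes: an unknown $n \times n$ dictionary $A$, a sparse random $n \times p$ coefficient matrix $X$, and observations $Y = AX$ from which one would want to recover $A$ and $X$. The Bernoulli-Gaussian sparsity model with rate `theta`, the Gaussian entries of $A$, and the constant `C` are illustrative assumptions; only the sample-size order $p \ge C n \log^4 n$ comes from the abstract.

```python
import numpy as np

def make_instance(n=40, C=4.0, theta=0.1, seed=0):
    """Generate an illustrative dictionary-learning instance Y = A X.

    Assumptions (not from the paper): A has i.i.d. Gaussian entries,
    X is Bernoulli(theta)-Gaussian, and C is an arbitrary constant.
    """
    rng = np.random.default_rng(seed)
    # Number of samples at the order shown sufficient in the paper: p >= C n log^4 n.
    p = int(np.ceil(C * n * np.log(n) ** 4))
    A = rng.standard_normal((n, n))         # unknown dictionary (generically nonsingular)
    mask = rng.random((n, p)) < theta       # sparse support of X
    X = mask * rng.standard_normal((n, p))  # sparse random coefficients
    Y = A @ X                               # observed data; the task is to recover A and X
    return A, X, Y

A, X, Y = make_instance()
print(Y.shape)  # (n, p) with p on the order of n log^4 n
```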
