First Efficient Convergence for Streaming k-PCA: a Global, Gap-Free, and Near-Optimal Rate

We study streaming principal component analysis (PCA), that is, finding, in $O(dk)$ space, the top $k$ eigenvectors of a $d \times d$ hidden matrix $\Sigma$ from online vectors drawn from a distribution with covariance matrix $\Sigma$. We provide global convergence for Oja's algorithm, which is popular in practice but lacks theoretical understanding for $k > 1$. We also provide a modified variant, Oja++, that runs even faster than Oja's. Our results match the information-theoretic lower bound in their dependency on the error, the eigengap, the rank $k$, and the dimension $d$, up to poly-log factors. In addition, our convergence rate can be made gap-free, that is, proportional to the approximation error and independent of the eigengap. In contrast, for general rank $k$, before our work (1) it was open to design any algorithm with an efficient global convergence rate; and (2) it was open to design any algorithm with an (even local) gap-free convergence rate in $O(dk)$ space.
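
For orientation, the classical Oja update that the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's analysis or its modified variant: the names `oja_k_pca`, `orthonormalize`, and `dot`, the constant step size `eta`, and the use of Gram-Schmidt as the orthonormalization step are all assumptions made for this example. Each sample `x` of dimension `d` triggers the update Q ← orth((I + η x xᵀ) Q) on a basis of `k` columns, so memory stays proportional to `d * k`.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(cols):
    """Modified Gram-Schmidt on a list of column vectors (assumed independent)."""
    basis = []
    for v in cols:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis


def oja_k_pca(stream, d, k, eta, rng):
    """Streaming k-PCA sketch: one rank-one update per sample.

    Only k columns of length d are stored; the d x d matrix x x^T is
    never formed. The constant step size eta is an illustrative choice.
    """
    # Random Gaussian start, stored as k columns of length d.
    cols = orthonormalize(
        [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    )
    for x in stream:
        updated = []
        for q in cols:
            # Column-wise (I + eta * x x^T) q = q + eta * (x . q) * x
            c = eta * dot(x, q)
            updated.append([qi + c * xi for qi, xi in zip(q, x)])
        cols = orthonormalize(updated)
    return cols
```

A caller would pass any iterable of length-`d` samples as `stream`; the returned columns are orthonormal and, under a suitable step size, approximately span the top-`k` eigenspace of the sample covariance.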