Exploiting Numerical Sparsity for Efficient Learning: Faster Eigenvector Computation and Regression

In this paper, we obtain improved running times for regression and top eigenvector computation for numerically sparse matrices. Given a data matrix $A \in \mathbb{R}^{n \times d}$ where every row $a \in \mathbb{R}^d$ has $\|a\|_2^2 \le L$ and numerical sparsity at most $s$, i.e. $\|a\|_1^2 / \|a\|_2^2 \le s$, we provide faster algorithms for these problems in many parameter settings. For top eigenvector computation, we obtain a running time of $\tilde{O}(nd + r(s + \sqrt{rs})/\mathrm{gap}^2)$, where $\mathrm{gap} > 0$ is the relative gap between the top two eigenvalues of $A^\top A$ and $r$ is the stable rank of $A$. This running time improves upon the previous best unaccelerated running time of $\tilde{O}(nd + rd/\mathrm{gap}^2)$, as it is always the case that $r \le d$ and $s \le d$. For regression, we obtain a running time of $\tilde{O}(nd + (nL/\mu)(s + \sqrt{sL/\mu}))$, where $\mu > 0$ is the smallest eigenvalue of $A^\top A$. This running time improves upon the previous best unaccelerated running time of $\tilde{O}(nd + nLd/\mu)$. This result expands the regimes in which regression can be solved in nearly linear time from when $L/\mu = \tilde{O}(1)$ to when $(L/\mu)(s + \sqrt{sL/\mu}) = \tilde{O}(d)$. Furthermore, we obtain similar improvements even when row norms and numerical sparsities are non-uniform, and we show how to achieve even faster running times by accelerating using approximate proximal point [Frostig et al. 2015] / catalyst [Lin et al. 2015]. Our running times depend only on the size of the input and natural numerical measures of the matrix, i.e. its eigenvalues and norms, making progress on a key open problem regarding optimal running times for efficient large-scale learning.
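To make the quantities in these bounds concrete, here is a minimal sketch (not from the paper; the helper names `numerical_sparsity` and `matrix_measures` are ours) that computes, for a data matrix $A$, the row-norm bound $L$, the numerical sparsity $s$, the stable rank $r$, the relative eigenvalue gap of $A^\top A$, and its smallest eigenvalue $\mu$:

```python
# Illustrative sketch only: compute the quantities the running-time bounds
# are stated in terms of. Helper names are ours, not from the paper.
import numpy as np

def numerical_sparsity(a):
    """Numerical sparsity of a row: ||a||_1^2 / ||a||_2^2 (always between 1 and d)."""
    return np.linalg.norm(a, 1) ** 2 / np.linalg.norm(a, 2) ** 2

def matrix_measures(A):
    """Return (L, s, r, gap, mu) for a data matrix A with n >= d rows."""
    row_sq_norms = np.sum(A ** 2, axis=1)
    L = row_sq_norms.max()                      # max squared row norm
    s = max(numerical_sparsity(a) for a in A)   # max numerical sparsity over rows
    sigma = np.linalg.svd(A, compute_uv=False)  # singular values of A (descending)
    eigs = sigma ** 2                           # eigenvalues of A^T A
    r = eigs.sum() / eigs[0]                    # stable rank ||A||_F^2 / ||A||_2^2
    gap = (eigs[0] - eigs[1]) / eigs[0]         # relative gap between top two eigenvalues
    mu = eigs[-1]                               # smallest eigenvalue of A^T A (n >= d)
    return L, s, r, gap, mu

# Example: a numerically sparse matrix (one large entry per row plus small dense noise),
# so s is much smaller than the dimension d even though no entry is exactly zero.
rng = np.random.default_rng(0)
n, d = 1000, 200
A = 0.01 * rng.standard_normal((n, d))
A[np.arange(n), rng.integers(0, d, size=n)] += 1.0
print(matrix_measures(A))
```

Computing these quantities exactly via a dense SVD is of course more expensive than the algorithms above; the sketch is only meant to illustrate the definitions that the stated running times depend on.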