Recently, Musco and Woodruff (FOCS, 2017) showed that given an $n \times n$ positive semidefinite (PSD) matrix $A$, it is possible to compute a $(1+\epsilon)$-approximate relative-error low-rank approximation to $A$ by querying $\widetilde{O}(nk/\epsilon^{2.5})$ entries of $A$ in $\widetilde{O}(n \cdot \mathrm{poly}(k/\epsilon))$ time. They also showed that any relative-error low-rank approximation algorithm must query $\widetilde{\Omega}(nk/\epsilon)$ entries of $A$; this gap has since remained open. Our main result resolves this question: we obtain an optimal algorithm that queries $\widetilde{O}(nk/\epsilon)$ entries of $A$ and outputs a relative-error low-rank approximation in $\widetilde{O}(n \cdot (k/\epsilon)^{\omega-1})$ time. Note that our running time improves on that of Musco and Woodruff, and matches the information-theoretic lower bound if the matrix-multiplication exponent $\omega$ is $2$. We then extend our techniques to negative-type distance matrices. Bakshi and Woodruff (NeurIPS, 2018) showed a bi-criteria, relative-error low-rank approximation which queries $\widetilde{O}(nk/\epsilon^{2.5})$ entries and outputs a matrix of rank slightly larger than $k$. We show that the bi-criteria guarantee is not necessary and obtain an $\widetilde{O}(nk/\epsilon)$-query algorithm, which is optimal. Our algorithm applies to all distance matrices that arise from metrics satisfying negative-type inequalities, including spherical metrics and hypermetrics. Next, we introduce a new robust low-rank approximation model which captures PSD matrices that have been corrupted with noise. While a sample-complexity lower bound precludes sublinear algorithms for arbitrary PSD matrices, we provide the first sublinear-time and sublinear-query algorithms when the corruption on the diagonal entries is bounded. As a special case, we show sample-optimal, sublinear-time algorithms for low-rank approximation of correlation matrices corrupted by noise.
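The results above are stated in the entry-query model: the algorithm may read individual entries $A_{ij}$ and is charged per query, so reading far fewer than all $n^2$ entries is what makes the running times sublinear. As a purely illustrative sketch of this access model (not the algorithm of this paper or of Musco and Woodruff), the following Python snippet builds a factored rank-$k$ approximation of a PSD matrix from entry queries by sampling columns proportionally to the diagonal and forming a Nyström-style reconstruction; the function name, the $10k/\epsilon$ sample count, and the synthetic test matrix are assumptions made for the example.

```python
import numpy as np

def sketch_psd_low_rank(query, n, k, eps, rng=None):
    """Illustrative only: factored low-rank approximation of an n x n PSD
    matrix from entry queries `query(i, j) -> A[i, j]`. Columns are sampled
    with probability proportional to the diagonal (a sensible bias for PSD
    matrices, since |A[i, j]|^2 <= A[i, i] * A[j, j]), then a Nystrom-style
    rank-k reconstruction is returned in factored form."""
    rng = np.random.default_rng() if rng is None else rng
    # Read the diagonal with n entry queries.
    diag = np.array([float(query(i, i)) for i in range(n)])
    probs = diag / diag.sum() if diag.sum() > 0 else np.full(n, 1.0 / n)
    # Sample s columns; the constant 10 is an arbitrary choice for the demo.
    s = min(n, max(k + 1, int(np.ceil(10 * k / eps))))
    cols = rng.choice(n, size=s, replace=True, p=probs)
    # Query the n x s column submatrix C and its s x s intersection W.
    C = np.array([[query(i, j) for j in cols] for i in range(n)])
    W = C[cols, :]
    # Pseudo-inverse of the best rank-k approximation of W.
    U, sig, _ = np.linalg.svd(W)
    inv = np.where(sig[:k] > 1e-10 * sig[0], 1.0 / sig[:k], 0.0)
    M = U[:, :k] @ np.diag(inv) @ U[:, :k].T
    return C, M  # A is approximated by C @ M @ C.T

if __name__ == "__main__":
    # Synthetic test: an exactly rank-k PSD matrix.
    n, k, eps = 500, 5, 0.5
    G = np.random.default_rng(0).standard_normal((n, k))
    A = G @ G.T
    C, M = sketch_psd_low_rank(lambda i, j: A[i, j], n, k, eps)
    err = np.linalg.norm(A - C @ M @ C.T, "fro") / np.linalg.norm(A, "fro")
    print(f"relative Frobenius error: {err:.3e}")
```

The total number of entries read here is $n + n \cdot s$ with $s = O(k/\epsilon)$, i.e., sublinear in $n^2$; achieving the relative-error guarantee with the optimal $\widetilde{O}(nk/\epsilon)$ query bound requires the more careful sampling developed in the paper.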