503
v1v2 (latest)

Sparse sketches with small inversion bias

Annual Conference Computational Learning Theory (COLT), 2020
Abstract

For a tall n×dn\times d matrix AA and a random m×nm\times n sketching matrix SS, the sketched estimate of the inverse covariance matrix (AA)1(A^\top A)^{-1} is typically biased: E[(A~A~)1](AA)1E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}, where A~=SA\tilde A=SA. This phenomenon, which we call inversion bias, arises, e.g., in statistics and distributed optimization, when averaging multiple independently constructed estimates of quantities that depend on the inverse covariance. We develop a framework for analyzing inversion bias, based on our proposed concept of an (ϵ,δ)(\epsilon,\delta)-unbiased estimator for random matrices. We show that when the sketching matrix SS is dense and has i.i.d. sub-gaussian entries, then after simple rescaling, the estimator (mmdA~A~)1(\frac m{m-d}\tilde A^\top\tilde A)^{-1} is (ϵ,δ)(\epsilon,\delta)-unbiased for (AA)1(A^\top A)^{-1} with a sketch of size m=O(d+d/ϵ)m=O(d+\sqrt d/\epsilon). This implies that for m=O(d)m=O(d), the inversion bias of this estimator is O(1/d)O(1/\sqrt d), which is much smaller than the Θ(1)\Theta(1) approximation error obtained as a consequence of the subspace embedding guarantee for sub-gaussian sketches. We then propose a new sketching technique, called LEverage Score Sparsified (LESS) embeddings, which uses ideas from both data-oblivious sparse embeddings as well as data-aware leverage-based row sampling methods, to get ϵ\epsilon inversion bias for sketch size m=O(dlogd+d/ϵ)m=O(d\log d+\sqrt d/\epsilon) in time O(nnz(A)logn+md2)O(\text{nnz}(A)\log n+md^2), where nnz is the number of non-zeros. The key techniques enabling our analysis include an extension of a classical inequality of Bai and Silverstein for random quadratic forms, which we call the Restricted Bai-Silverstein inequality; and anti-concentration of the Binomial distribution via the Paley-Zygmund inequality, which we use to prove a lower bound showing that leverage score sampling sketches generally do not achieve small inversion bias.

View on arXiv
Comments on this paper