Affine Invariant Covariance Estimation for Heavy-Tailed Distributions

Annual Conference Computational Learning Theory (COLT), 2019
Abstract

In this work we provide an estimator for the covariance matrix of a heavy-tailed random vector. We prove that the proposed estimator $\widehat{\mathbf{S}}$ admits \textit{affine-invariant} bounds of the form $(1-\varepsilon)\,\mathbf{S} \preccurlyeq \widehat{\mathbf{S}} \preccurlyeq (1+\varepsilon)\,\mathbf{S}$ with high probability, where $\mathbf{S}$ is the unknown covariance matrix and $\preccurlyeq$ is the positive semidefinite order on symmetric matrices. The result requires only the existence of fourth-order moments, and allows for $\varepsilon = O(\sqrt{\kappa^4 d/n})$, where $\kappa^4$ is a measure of kurtosis of the distribution, $d$ is the dimensionality of the space, and $n$ is the sample size. More generally, we can allow for regularization at level~$\lambda$, in which case $\varepsilon$ depends on the degrees-of-freedom number, which is generally smaller than $d$. The computational cost of the proposed estimator is essentially~$O(d^2 n + d^3)$, comparable to that of the sample covariance matrix in the statistically interesting regime~$n \gg d$. Applications to eigenvalue estimation with relative error and to ridge regression with heavy-tailed random design are discussed.
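The affine-invariant guarantee $(1-\varepsilon)\,\mathbf{S} \preccurlyeq \widehat{\mathbf{S}} \preccurlyeq (1+\varepsilon)\,\mathbf{S}$ can be checked numerically: it is equivalent to every eigenvalue of $\mathbf{S}^{-1/2}\widehat{\mathbf{S}}\,\mathbf{S}^{-1/2}$ lying in $[1-\varepsilon, 1+\varepsilon]$. The paper's estimator is not reproduced here; the sketch below (with illustrative names, using the plain sample covariance as a stand-in estimator) only demonstrates how such a bound is evaluated:

```python
import numpy as np

def affine_invariant_error(S_hat, S):
    """Smallest eps with (1-eps) S <= S_hat <= (1+eps) S in the PSD order.

    Equals the largest |lambda - 1| over eigenvalues lambda of
    S^{-1/2} S_hat S^{-1/2}; this quantity is invariant under x -> A x.
    """
    # Symmetric inverse square root of the true covariance S.
    w, V = np.linalg.eigh(S)
    S_inv_half = V @ np.diag(w ** -0.5) @ V.T
    lam = np.linalg.eigvalsh(S_inv_half @ S_hat @ S_inv_half)
    return np.max(np.abs(lam - 1.0))

# Stand-in estimator: sample covariance on synthetic Gaussian data.
rng = np.random.default_rng(0)
d, n = 5, 10_000
A = rng.standard_normal((d, d))
S = A @ A.T + np.eye(d)                      # true covariance, well-conditioned
X = rng.multivariate_normal(np.zeros(d), S, size=n)
Xc = X - X.mean(axis=0)
S_hat = Xc.T @ Xc / n
eps = affine_invariant_error(S_hat, S)
print(eps)  # small when n >> d
```

Note that the error measured this way does not change if the data are replaced by $\mathbf{A}x$ for an invertible $\mathbf{A}$, which is exactly what "affine-invariant" means in the abstract.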
