
List-Decodable Robust Mean Estimation and Learning Mixtures of Spherical Gaussians

Abstract

We study the problem of list-decodable Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians. We develop a set of techniques that yield new efficient algorithms with significantly improved guarantees for these problems.

{\bf List-Decodable Mean Estimation.} Fix any dZ+d \in \mathbb{Z}_+ and 0<α<1/20< \alpha <1/2. We design an algorithm with runtime O(poly(n/α)d)O (\mathrm{poly}(n/\alpha)^{d}) that outputs a list of O(1/α)O(1/\alpha) candidate vectors such that with high probability one of the candidates is within 2\ell_2-distance O(α1/(2d))O(\alpha^{-1/(2d)}) of the true mean. The only previous algorithm for this problem achieved error O~(α1/2)\tilde O(\alpha^{-1/2}) under second moment conditions. For d=O(1/ϵ)d = O(1/\epsilon), our algorithm runs in polynomial time and achieves error O(αϵ)O(\alpha^{\epsilon}). We also give a Statistical Query lower bound suggesting that the complexity of our algorithm is qualitatively close to best possible.

{\bf Learning Mixtures of Spherical Gaussians.} We give a learning algorithm for mixtures of spherical Gaussians that succeeds under significantly weaker separation assumptions compared to prior work. For the prototypical case of a uniform mixture of kk identity covariance Gaussians we obtain the following: for any ϵ>0\epsilon>0, if the pairwise separation between the means is at least Ω(kϵ+log(1/δ))\Omega(k^{\epsilon}+\sqrt{\log(1/\delta)}), our algorithm learns the unknown parameters within accuracy δ\delta with sample complexity and running time poly(n,1/δ,(k/ϵ)1/ϵ)\mathrm{poly} (n, 1/\delta, (k/\epsilon)^{1/\epsilon}). The previously best known polynomial-time algorithm required separation at least k1/4polylog(k/δ)k^{1/4} \mathrm{polylog}(k/\delta).

Our main technical contribution is a new technique, using degree-dd multivariate polynomials, to remove outliers from high-dimensional datasets where the majority of the points are corrupted.
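To give intuition for the polynomial-based outlier-removal technique the abstract describes, the sketch below implements its simplest special case, assuming degree-1 (linear) polynomials: score each point by the squared projection of its centered value onto the top principal direction, and discard points with anomalously large scores. This is an illustrative toy only; the function name, threshold rule, and single-round structure are our own choices, and the paper's actual filter uses degree-dd polynomials with an adaptive, recursive procedure that outputs a list of candidate means.

```python
import numpy as np

def filter_outliers_linear(X, threshold=10.0):
    """One round of spectral outlier filtering (illustrative sketch).

    A degree-1 special case of polynomial-based filtering: the linear
    polynomial p(x) = <v, x - mu>, where v is the direction of largest
    empirical variance, takes unusually large values on outliers.

    X: (n, dim) array of samples, possibly majority-corrupted.
    Returns the subset of rows whose scores are below a multiple of
    the median score. The threshold rule here is a hypothetical
    simplification, not the paper's adaptive criterion.
    """
    mu = X.mean(axis=0)
    centered = X - mu
    cov = centered.T @ centered / len(X)
    # Top eigenvector of the empirical covariance: the linear
    # polynomial with the largest empirical variance.
    _, eigvecs = np.linalg.eigh(cov)
    v = eigvecs[:, -1]
    scores = (centered @ v) ** 2
    # Keep points whose score is not far above the median score.
    keep = scores < threshold * np.median(scores)
    return X[keep]
```

In this toy, a single pass suffices when outliers form a far-away cluster; the list-decodable setting (inliers an α\alpha-fraction of the data) additionally requires branching into multiple candidate subsets rather than committing to one.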
