List-Decodable Robust Mean Estimation and Learning Mixtures of Spherical Gaussians

We study the problem of list-decodable Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians. We develop a set of techniques that yield new efficient algorithms with significantly improved guarantees for these problems.

{\bf List-Decodable Mean Estimation.} Fix any and . We design an algorithm with runtime that outputs a list of many candidate vectors such that with high probability one of the candidates is within -distance from the true mean. The only previous algorithm for this problem achieved error under second moment conditions. For , our algorithm runs in polynomial time and achieves error . We also give a Statistical Query lower bound suggesting that the complexity of our algorithm is qualitatively close to best possible.

{\bf Learning Mixtures of Spherical Gaussians.} We give a learning algorithm for mixtures of spherical Gaussians that succeeds under significantly weaker separation assumptions compared to prior work. For the prototypical case of a uniform mixture of identity covariance Gaussians we obtain: For any , if the pairwise separation between the means is at least , our algorithm learns the unknown parameters within accuracy with sample complexity and running time . The previously best known polynomial time algorithm required separation at least .

Our main technical contribution is a new technique, using degree- multivariate polynomials, to remove outliers from high-dimensional datasets where the majority of the points are corrupted.