
List-Decodable Sparse Mean Estimation via Difference-of-Pairs Filtering

Abstract

We study the problem of list-decodable sparse mean estimation. Specifically, for a parameter $\alpha \in (0, 1/2)$, we are given $m$ points in $\mathbb{R}^n$, $\lfloor \alpha m \rfloor$ of which are i.i.d. samples from a distribution $D$ with unknown $k$-sparse mean $\mu$. No assumptions are made on the remaining points, which form the majority of the dataset. The goal is to return a small list of candidates containing a vector $\widehat \mu$ such that $\| \widehat \mu - \mu \|_2$ is small. Prior work had studied the problem of list-decodable mean estimation in the dense setting. In this work, we develop a novel, conceptually simpler technique for list-decodable mean estimation. As the main application of our approach, we provide the first sample and computationally efficient algorithm for list-decodable sparse mean estimation. In particular, for distributions with "certifiably bounded" $t$-th moments in $k$-sparse directions and sufficiently light tails, our algorithm achieves error of $(1/\alpha)^{O(1/t)}$ with sample complexity $m = (k\log(n))^{O(t)}/\alpha$ and running time $\mathrm{poly}(mn^t)$. For the special case of Gaussian inliers, our algorithm achieves the optimal error guarantee of $\Theta (\sqrt{\log(1/\alpha)})$ with quasi-polynomial sample and computational complexity. We complement our upper bounds with nearly-matching statistical query and low-degree polynomial testing lower bounds.
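
To make the input model and the success criterion concrete, the following is a minimal illustrative sketch and not the paper's algorithm. It builds a toy instance with $\lfloor \alpha m \rfloor$ Gaussian inliers around a $k$-sparse mean and measures the error of a candidate list as the distance from $\mu$ to its closest candidate. The function names `make_instance` and `list_error`, the identity-covariance inliers, and the placeholder outlier cluster are all assumptions made for illustration; in the actual problem the outliers are arbitrary.

```python
import math
import random


def make_instance(m, n, k, alpha, seed=0):
    """Toy list-decodable sparse mean instance (hypothetical helper).

    Returns (points, mu): m points in R^n, of which floor(alpha * m) are
    inliers drawn as mu + N(0, I), where mu has at most k nonzero entries.
    """
    rng = random.Random(seed)

    # Unknown k-sparse mean: k randomly chosen coordinates may be nonzero.
    mu = [0.0] * n
    for i in rng.sample(range(n), k):
        mu[i] = rng.uniform(-1.0, 1.0)

    num_inliers = math.floor(alpha * m)
    inliers = [
        [mu[j] + rng.gauss(0.0, 1.0) for j in range(n)]
        for _ in range(num_inliers)
    ]

    # Assumption: the remaining points are unconstrained in the problem
    # statement; a shifted Gaussian cluster is used here only as a stand-in.
    outliers = [
        [5.0 + rng.gauss(0.0, 1.0) for _ in range(n)]
        for _ in range(m - num_inliers)
    ]

    points = inliers + outliers
    rng.shuffle(points)
    return points, mu


def list_error(candidates, mu):
    """Error of a candidate list: Euclidean distance from mu to the
    closest candidate, i.e. min over the list of ||mu_hat - mu||_2."""
    return min(math.dist(c, mu) for c in candidates)
```

A list-decoding algorithm succeeds on such an instance if it returns a small list whose `list_error` is small, even though no single estimate can be trusted when inliers are a minority.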
