
Efficient Multivariate Robust Mean Estimation Under Mean-Shift Contamination

Abstract

We study the algorithmic problem of robust mean estimation of an identity-covariance Gaussian in the presence of mean-shift contamination. In this contamination model, we are given a set of points in $\mathbb{R}^d$ generated i.i.d. via the following process. For a parameter $\alpha < 1/2$, the $i$-th sample $x_i$ is obtained as follows: with probability $1-\alpha$, $x_i$ is drawn from $\mathcal{N}(\mu, I)$, where $\mu \in \mathbb{R}^d$ is the target mean; and with probability $\alpha$, $x_i$ is drawn from $\mathcal{N}(z_i, I)$, where $z_i$ is unknown and potentially arbitrary. Prior work characterized the information-theoretic limits of this task. Specifically, it was shown that, in contrast to Huber contamination, in the presence of mean-shift contamination consistent estimation is possible. On the other hand, all known robust estimators in the mean-shift model have running times exponential in the dimension. Here we give the first computationally efficient algorithm for high-dimensional robust mean estimation with mean-shift contamination that can tolerate a constant fraction of outliers. In particular, our algorithm has near-optimal sample complexity, runs in sample-polynomial time, and approximates the target mean to any desired accuracy. Conceptually, our result contributes to a growing body of work that studies inference with respect to natural noise models lying in between fully adversarial and random settings.
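
The sampling process described above can be made concrete with a short simulation. The sketch below is illustrative only and is not the paper's estimator: it generates data from the mean-shift contamination model and computes the naive empirical mean, which is biased by the shifted points. The names `sample_mean_shift`, `empirical_mean`, and `shift_fn` are hypothetical, and `shift_fn` is an assumed stand-in for the unknown, potentially arbitrary choice of each $z_i$.

```python
import random


def sample_mean_shift(n, mu, alpha, shift_fn, seed=0):
    """Draw n points from the mean-shift contamination model.

    Each point is drawn from N(mu, I) with probability 1 - alpha and
    from N(z_i, I) with probability alpha, where z_i = shift_fn(i).
    shift_fn is a hypothetical placeholder for the adversary.
    """
    rng = random.Random(seed)
    points, is_outlier = [], []
    for i in range(n):
        contaminated = rng.random() < alpha
        center = shift_fn(i) if contaminated else mu
        # Identity covariance: independent unit-variance noise per coordinate.
        points.append([c + rng.gauss(0.0, 1.0) for c in center])
        is_outlier.append(contaminated)
    return points, is_outlier


def empirical_mean(points):
    """Coordinate-wise average; a non-robust baseline, not the paper's method."""
    n, d = len(points), len(points[0])
    return [sum(p[j] for p in points) / n for j in range(d)]


if __name__ == "__main__":
    mu = [0.0, 0.0, 0.0]
    pts, flags = sample_mean_shift(
        n=2000, mu=mu, alpha=0.2, shift_fn=lambda i: [10.0, 10.0, 10.0]
    )
    print("outlier fraction:", sum(flags) / len(flags))
    print("naive mean:", empirical_mean(pts))
```

With every outlier shifted to the same point, the naive mean is pulled away from $\mu$ by roughly $\alpha (z - \mu)$ in expectation, which is the kind of bias a robust estimator in this model has to remove.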

@article{diakonikolas2025_2502.14772,
  title={Efficient Multivariate Robust Mean Estimation Under Mean-Shift Contamination},
  author={Ilias Diakonikolas and Giannis Iakovidis and Daniel M. Kane and Thanasis Pittas},
  journal={arXiv preprint arXiv:2502.14772},
  year={2025}
}