We consider learning in an adversarial environment, where an $\varepsilon$-fraction of samples from a distribution $\mu$ are arbitrarily modified (global corruptions) and the remaining perturbations have average magnitude bounded by $\rho$ (local corruptions). Given access to $n$ such corrupted samples, we seek a computationally efficient estimator $\hat{\mu}_n$ that minimizes the Wasserstein distance $\mathsf{W}_1(\hat{\mu}_n, \mu)$. In fact, we attack the fine-grained task of minimizing $\mathsf{W}_1(\Pi_\sharp \hat{\mu}_n, \Pi_\sharp \mu)$ for all orthogonal projections $\Pi \in \mathbb{R}^{d \times d}$, with performance scaling with $\mathrm{rank}(\Pi) = k$. This allows us to account simultaneously for mean estimation ($k = 1$) and distribution estimation ($k = d$), as well as the settings interpolating between these two extremes. We characterize the optimal population-limit risk for this task and then develop an efficient finite-sample algorithm with error bounded by $\sqrt{\varepsilon k} + \rho + \tilde{O}(d\sqrt{k}\, n^{-1/(k \vee 2)})$ when $\mu$ has bounded covariance. This guarantee holds uniformly in $k$ and is minimax optimal up to the sub-optimality of the plug-in estimator when $\rho = \varepsilon = 0$. Our efficient procedure relies on a novel trace norm approximation of an ideal yet intractable 2-Wasserstein projection estimator. We apply this algorithm to robust stochastic optimization and, in the process, uncover a new method for overcoming the curse of dimensionality in Wasserstein distributionally robust optimization.
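As a concrete illustration of the projected objective $\mathsf{W}_1(\Pi_\sharp \hat{\mu}_n, \Pi_\sharp \mu)$, the following minimal Python sketch evaluates it for a rank-one projection (the $k = 1$, mean-estimation end of the spectrum), where the one-dimensional $\mathsf{W}_1$ between equal-size empirical samples has a closed form via sorted order statistics. This only illustrates the objective under an $\varepsilon$-fraction of global corruptions, not the paper's trace-norm estimator; the function name and demo parameters are hypothetical.

```python
# Illustrative sketch only -- not the paper's trace-norm algorithm.
# Evaluates W_1(Pi_# P, Pi_# Q) for a rank-1 orthogonal projection Pi = u u^T,
# using the closed form for W_1 between equal-size 1-D empirical measures.
import numpy as np

def projected_w1(x: np.ndarray, y: np.ndarray, u: np.ndarray) -> float:
    """W_1 between the pushforwards of two equal-size samples along direction u."""
    u = u / np.linalg.norm(u)        # unit vector defining the rank-1 projection
    px, py = x @ u, y @ u            # 1-D pushforwards of the two samples
    # For equal-size empirical measures, W_1 is the mean gap between order statistics.
    return float(np.mean(np.abs(np.sort(px) - np.sort(py))))

# Hypothetical demo: an eps-fraction of samples is globally corrupted.
rng = np.random.default_rng(0)
d, n, eps = 20, 1000, 0.05
clean = rng.normal(size=(n, d))                         # samples from the true distribution
corrupted = clean.copy()
outliers = rng.choice(n, size=int(eps * n), replace=False)
corrupted[outliers] += 50.0                             # arbitrary global modifications
u = rng.normal(size=d)
print(projected_w1(clean, corrupted, u))                # projected W_1 inflated by outliers
```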