Smooth $p$-Wasserstein Distance: Structure, Empirical Approximation, and Statistical Applications

Abstract

Discrepancy measures between probability distributions, often termed statistical distances, are ubiquitous in probability theory, statistics, and machine learning. To combat the curse of dimensionality when estimating these distances from data, recent work has proposed smoothing out local irregularities in the measured distributions via convolution with a Gaussian kernel. Motivated by the scalability of this framework to high dimensions, we investigate the structural and statistical behavior of the Gaussian-smoothed $p$-Wasserstein distance $\mathsf{W}_p^{(\sigma)}$, for arbitrary $p \geq 1$. After establishing basic metric and topological properties of $\mathsf{W}_p^{(\sigma)}$, we explore the asymptotic statistical behavior of $\mathsf{W}_p^{(\sigma)}(\hat{\mu}_n, \mu)$, where $\hat{\mu}_n$ is the empirical distribution of $n$ independent observations from $\mu$. We prove that $\mathsf{W}_p^{(\sigma)}$ enjoys a parametric empirical convergence rate of $n^{-1/2}$, which contrasts the $n^{-1/d}$ rate for unsmoothed $\mathsf{W}_p$ when $d \geq 3$. Our proof relies on controlling $\mathsf{W}_p^{(\sigma)}$ by a $p$th-order smooth Sobolev distance $\mathsf{d}_p^{(\sigma)}$ and deriving the limit distribution of $\sqrt{n}\,\mathsf{d}_p^{(\sigma)}(\hat{\mu}_n, \mu)$, for all dimensions $d$. As applications, we provide asymptotic guarantees for two-sample testing and minimum distance estimation using $\mathsf{W}_p^{(\sigma)}$, with experiments for $p = 2$ using a maximum mean discrepancy formulation of $\mathsf{d}_2^{(\sigma)}$.
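The smoothed distance compares the measures after Gaussian convolution, i.e., $\mathsf{W}_p^{(\sigma)}(\mu, \nu) = \mathsf{W}_p(\mu * \mathcal{N}_\sigma, \nu * \mathcal{N}_\sigma)$ with $\mathcal{N}_\sigma = \mathcal{N}(0, \sigma^2 I_d)$. As a minimal illustrative sketch (not the paper's implementation), the snippet below estimates this quantity in one dimension via Monte Carlo: Gaussian noise is added to each sample to realize the convolution, and the one-dimensional $\mathsf{W}_p$ between the resulting empirical measures is computed in closed form from the quantile (sorted-sample) coupling. The function name `smoothed_wp` and the single-draw noise scheme are assumptions made here for illustration.

```python
import numpy as np

def smoothed_wp(x, y, sigma=1.0, p=1, seed=0):
    """Monte Carlo sketch of the Gaussian-smoothed p-Wasserstein distance
    between two equal-size 1-D samples x and y.

    Each empirical measure is convolved with N(0, sigma^2) by adding
    independent Gaussian noise to its samples; the 1-D W_p between the
    noisy empirical measures is then the p-th moment of the difference
    of order statistics (quantile coupling).
    """
    rng = np.random.default_rng(seed)
    xs = np.sort(x + sigma * rng.standard_normal(x.shape))
    ys = np.sort(y + sigma * rng.standard_normal(y.shape))
    # For equal-size 1-D empirical measures:
    # W_p^p = (1/n) * sum_i |x_(i) - y_(i)|^p
    return np.mean(np.abs(xs - ys) ** p) ** (1.0 / p)

# Example: a shift by c should yield a smoothed W_1 close to c,
# since convolving both measures with the same Gaussian preserves the shift.
rng = np.random.default_rng(1)
x = rng.standard_normal(2000)
print(smoothed_wp(x, x + 2.0, sigma=1.0, p=1))  # close to 2.0
```

In higher dimensions no such closed form exists, which is where the maximum mean discrepancy formulation of $\mathsf{d}_2^{(\sigma)}$ mentioned above becomes computationally attractive.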
