
Nonparametric MLE for Gaussian Location Mixtures: Certified Computation and Generic Behavior

Abstract

We study the nonparametric maximum likelihood estimator $\widehat{\pi}$ for Gaussian location mixtures in one dimension. It has been known since (Lindsay, 1983) that given an $n$-point dataset, this estimator always returns a mixture with at most $n$ components, and more recently (Wu-Polyanskiy, 2020) gave a sharp $O(\log n)$ bound for subgaussian data. In this work we study computational aspects of $\widehat{\pi}$. We provide an algorithm which for small enough $\varepsilon>0$ computes an $\varepsilon$-approximation of $\widehat{\pi}$ in Wasserstein distance in time $K+Cnk^2\log\log(1/\varepsilon)$. Here $K$ is data-dependent but independent of $\varepsilon$, while $C$ is an absolute constant and $k=|\mathrm{supp}(\widehat{\pi})|\leq n$ is the number of atoms in $\widehat{\pi}$. We also certifiably compute the exact value of $|\mathrm{supp}(\widehat{\pi})|$ in finite time. These guarantees hold almost surely whenever the dataset $(x_1,\dots,x_n)\in[-cn^{1/4},cn^{1/4}]$ consists of independent points from a probability distribution with a density (relative to Lebesgue measure). We also show that the distribution of $\widehat{\pi}$ conditioned to be $k$-atomic admits a density on the associated $2k-1$ dimensional parameter space for all $k\leq\sqrt{n}/3$, and that the EM algorithm converges locally linearly almost surely. One key tool is a classical Fourier analytic estimate for non-degenerate curves.
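For readers unfamiliar with the EM algorithm mentioned above, the following is a minimal sketch of textbook EM for a $k$-atomic Gaussian location mixture. It is not the certified algorithm of the paper. It assumes unit-variance components and a fixed, user-chosen $k$, and the names `log_likelihood`, `em_step`, and `run_em` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def gaussian_density(x, atoms):
    # Matrix of N(atoms[j], 1) densities at x[i]; unit variance is an assumption.
    diff = x[:, None] - atoms[None, :]
    return np.exp(-0.5 * diff**2) / np.sqrt(2.0 * np.pi)

def log_likelihood(x, atoms, weights):
    # Average log-likelihood of x under the mixture sum_j weights[j] * N(atoms[j], 1).
    return float(np.mean(np.log(gaussian_density(x, atoms) @ weights)))

def em_step(x, atoms, weights):
    # E-step: posterior responsibility of atom j for data point i.
    joint = gaussian_density(x, atoms) * weights[None, :]
    resp = joint / joint.sum(axis=1, keepdims=True)
    # M-step: weights are mean responsibilities, atoms are weighted means.
    mass = resp.sum(axis=0)
    new_weights = mass / len(x)
    new_atoms = (resp * x[:, None]).sum(axis=0) / mass
    return new_atoms, new_weights

def run_em(x, k, iters=200, seed=0):
    # Hypothetical driver: initialise atoms at k distinct data points.
    rng = np.random.default_rng(seed)
    atoms = np.sort(rng.choice(x, size=k, replace=False))
    weights = np.full(k, 1.0 / k)
    history = [log_likelihood(x, atoms, weights)]
    for _ in range(iters):
        atoms, weights = em_step(x, atoms, weights)
        history.append(log_likelihood(x, atoms, weights))
    return atoms, weights, history
```

Each EM step does not decrease the likelihood, so `history` should be non-decreasing up to rounding. A parameter vector of $k$ atoms and $k$ weights summing to one has $2k-1$ free coordinates, which matches the dimension of the parameter space in the abstract.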

@article{polyanskiy2025_2503.20193,
  title={Nonparametric MLE for Gaussian Location Mixtures: Certified Computation and Generic Behavior},
  author={Yury Polyanskiy and Mark Sellke},
  journal={arXiv preprint arXiv:2503.20193},
  year={2025}
}