
Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models

Abstract

We study a class of weakly identifiable location-scale mixture models for which the maximum likelihood estimates based on $n$ i.i.d. samples are known to have lower accuracy than the classical $n^{-1/2}$ error. We investigate whether the Expectation-Maximization (EM) algorithm also converges slowly for these models. We provide a rigorous characterization of EM for fitting a weakly identifiable Gaussian mixture in a univariate setting, where we prove that the EM algorithm converges in order $n^{3/4}$ steps and returns estimates that are at a Euclidean distance of order $n^{-1/8}$ and $n^{-1/4}$ from the true location and scale parameter, respectively. Establishing the slow rates in the univariate setting requires a novel localization argument with two stages, with each stage involving an epoch-based argument applied to a different surrogate EM operator at the population level. We demonstrate several multivariate ($d \geq 2$) examples that exhibit the same slow rates as the univariate case. We also prove slow statistical rates in higher dimensions in a special case, when the fitted covariance is constrained to be a multiple of the identity.
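
As a rough illustration of the univariate setting, the sketch below runs sample EM on one concrete weakly identifiable instance. The setup is an assumption, since the abstract does not specify it: data are drawn from a standard normal, and the fitted model is the symmetric equal-weight mixture $\frac{1}{2} N(-\theta, \sigma^2) + \frac{1}{2} N(\theta, \sigma^2)$, so the true location is $0$ and the true scale is $1$. The names `em_step` and `run_em`, the initialization, and the step count are hypothetical choices for illustration only.

```python
import math
import random


def em_step(xs, theta, sigma2):
    """One EM update for the assumed fit 0.5*N(-theta, sigma2) + 0.5*N(theta, sigma2)."""
    n = len(xs)
    # tanh(x * theta / sigma2) equals the difference between the posterior
    # probabilities of the +theta and -theta components, so the E-step is
    # folded directly into the closed-form M-step for the location.
    new_theta = sum(x * math.tanh(x * theta / sigma2) for x in xs) / n
    # Closed-form M-step for the shared variance under this symmetric model.
    second_moment = sum(x * x for x in xs) / n
    new_sigma2 = second_moment - new_theta ** 2
    return new_theta, new_sigma2


def run_em(xs, theta0=0.5, sigma2_0=1.0, num_steps=300):
    """Run a fixed number of EM steps from an (assumed) initialization."""
    theta, sigma2 = theta0, sigma2_0
    for _ in range(num_steps):
        theta, sigma2 = em_step(xs, theta, sigma2)
    return theta, sigma2


random.seed(0)
n = 2000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]  # data from N(0, 1)
theta_hat, sigma2_hat = run_em(xs)

# Distances from the assumed true parameters (location 0, scale 1).
print("location error:", abs(theta_hat))
print("scale error:", abs(sigma2_hat - 1.0))
```

Repeating the run over a range of sample sizes and plotting the two errors against $n$ on a log-log scale is one way to compare empirical behavior with the $n^{-1/8}$ and $n^{-1/4}$ rates stated above; a single run at one sample size does not by itself confirm either rate.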
