Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models

We study a class of weakly identifiable location-scale mixture models for which the maximum likelihood estimates based on $n$ i.i.d. samples are known to have lower accuracy than the classical error of order $n^{-1/2}$. We investigate whether the Expectation-Maximization (EM) algorithm also converges slowly for these models. We provide a rigorous characterization of EM for fitting a weakly identifiable Gaussian mixture in a univariate setting, where we prove that the EM algorithm converges in order $n^{3/4}$ steps and returns estimates that are at a Euclidean distance of order $n^{-1/8}$ and $n^{-1/4}$ from the true location and scale parameters, respectively. Establishing the slow rates in the univariate setting requires a novel localization argument with two stages, with each stage involving an epoch-based argument applied to a different surrogate EM operator at the population level. We demonstrate several multivariate ($d \ge 2$) examples that exhibit the same slow rates as the univariate case. We also prove slow statistical rates in higher dimensions in a special case, when the fitted covariance is constrained to be a multiple of the identity.
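To make the setting concrete, here is a minimal NumPy sketch of sample-level EM for the symmetric two-component location-scale mixture $0.5\,N(\theta, \sigma^2) + 0.5\,N(-\theta, \sigma^2)$, fitted to data drawn from a single standard Gaussian ($\theta^* = 0$, $\sigma^{*2} = 1$), i.e. the weakly identifiable regime described above. The closed-form M-step follows from this symmetric parameterization; the initialization, iteration budget, and function names are illustrative assumptions, not the paper's experimental protocol.

```python
import numpy as np

def em_symmetric_mixture(x, theta0=0.5, n_iters=None):
    """Sample EM for 0.5*N(theta, sigma^2) + 0.5*N(-theta, sigma^2).

    A sketch under illustrative assumptions: the iteration budget is
    scaled as n^{3/4} to echo the order of steps proved in the paper.
    """
    n = len(x)
    if n_iters is None:
        n_iters = max(500, int(n ** 0.75))
    theta = theta0
    sigma2 = np.mean(x ** 2) - theta ** 2  # scale update consistent with theta
    for _ in range(n_iters):
        # E-step: posterior weight of the +theta component; the likelihood
        # ratio of the two components reduces to a logistic in 2*theta*x/sigma^2.
        w = 1.0 / (1.0 + np.exp(-2.0 * theta * x / sigma2))
        # M-step: closed form for this symmetric location-scale model.
        theta = np.mean((2.0 * w - 1.0) * x)
        sigma2 = np.mean(x ** 2) - theta ** 2
    return theta, sigma2

rng = np.random.default_rng(0)
for n in (1_000, 10_000, 100_000):
    x = rng.normal(size=n)  # true model: a single standard Gaussian
    theta_hat, sigma2_hat = em_symmetric_mixture(x)
    print(f"n={n:>6}  |theta_hat|={abs(theta_hat):.4f}  "
          f"|sigma2_hat - 1|={abs(sigma2_hat - 1.0):.4f}")
```

On such simulated data one should see the location error $|\hat\theta|$ shrink only slowly with $n$ while the scale error shrinks faster, qualitatively consistent with the $n^{-1/8}$ and $n^{-1/4}$ rates above, although a single short simulation only loosely reflects the asymptotics.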