A Nearly Optimal and Agnostic Algorithm for Properly Learning a Mixture of k Gaussians, for any Constant k

Learning a Gaussian mixture model (GMM) is a fundamental problem in machine learning, learning theory, and statistics. One notion of learning a GMM is proper learning: here, the goal is to find a mixture of $k$ Gaussians that is close to the density of the unknown distribution from which we draw samples. The distance between the hypothesis mixture and the unknown density is typically measured in the total variation or $L_1$-norm.

We give an algorithm for learning a mixture of $k$ univariate Gaussians that is nearly optimal for any fixed $k$. The sample complexity of our algorithm is $\widetilde{O}(k/\epsilon^2)$ and the running time is $\widetilde{O}(k/\epsilon^2) + (k \log(1/\epsilon))^{O(k^4)}$. It is well known that this sample complexity is optimal (up to logarithmic factors), and it was already achieved by prior work. However, the best known time complexity for properly learning a $k$-GMM was $\widetilde{O}(\epsilon^{-3k-1})$. In particular, the dependence of the running time on $1/\epsilon$ was exponential in $k$. We significantly improve this dependence by replacing the $1/\epsilon$ term with a $\log(1/\epsilon)$ while only increasing the exponent moderately. Hence, for any fixed $k$, the $\widetilde{O}(k/\epsilon^2)$ term dominates our running time, and thus our algorithm runs in time that is nearly linear in the number of samples drawn. Achieving a running time of $\mathrm{poly}(k, 1/\epsilon)$ for proper learning of $k$-GMMs has recently been stated as an open problem by multiple researchers, and we make progress on this question.

Moreover, our approach offers an agnostic learning guarantee: our algorithm returns a good GMM even if the distribution we are sampling from is not a mixture of Gaussians. To the best of our knowledge, our algorithm is the first agnostic proper learning algorithm for GMMs.
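A brief check of the dominance claim, taking the running-time bound above at face value (the $O(k^4)$ exponent is simply that bound's exponent, not derived here): for any constant $k$, the second term is only polylogarithmic in $1/\epsilon$, so it is absorbed by the first term as $\epsilon \to 0$,
\[
  \widetilde{O}\!\left(\frac{k}{\epsilon^{2}}\right) + \left(k \log\tfrac{1}{\epsilon}\right)^{O(k^{4})}
  \;=\;
  \widetilde{O}\!\left(\frac{k}{\epsilon^{2}}\right)
  \qquad \text{for any fixed } k,
\]
whereas in the previous bound of $\widetilde{O}(\epsilon^{-3k-1})$ the exponent of $1/\epsilon$ grows linearly with $k$.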