We consider PAC learning of probability distributions (a.k.a. density estimation), where we are given an i.i.d. sample generated from an unknown target distribution, and want to output a distribution that is close to the target in total variation distance. Let $\mathcal{F}$ be an arbitrary class of probability distributions, and let $\mathcal{F}^k$ denote the class of $k$-mixtures of elements of $\mathcal{F}$. Assuming the existence of a method for learning $\mathcal{F}$ with sample complexity $m_{\mathcal{F}}(\epsilon)$, we provide a method for learning $\mathcal{F}^k$ with sample complexity $O(k\log k \cdot m_{\mathcal{F}}(\epsilon)/\epsilon^{2})$. Our mixture learning algorithm has the property that, if the $\mathcal{F}$-learner is proper/agnostic, then the $\mathcal{F}^k$-learner is proper/agnostic as well. This general result enables us to improve the best known sample complexity upper bounds for a variety of important mixture classes. First, we show that the class of mixtures of $k$ axis-aligned Gaussians in $\mathbb{R}^d$ is PAC-learnable in the agnostic setting with $\widetilde{O}(kd/\epsilon^{4})$ samples, which is tight in $k$ and $d$ up to logarithmic factors. Second, we show that the class of mixtures of $k$ Gaussians in $\mathbb{R}^d$ is PAC-learnable in the agnostic setting with sample complexity $\widetilde{O}(kd^{2}/\epsilon^{4})$, which improves the previously known bounds of $\widetilde{O}(k^{3}d^{2}/\epsilon^{4})$ and $\widetilde{O}(k^{4}d^{4}/\epsilon^{2})$ in their dependence on $k$ and $d$. Finally, we show that the class of mixtures of log-concave distributions over $\mathbb{R}^d$ is PAC-learnable using $\widetilde{O}(d^{(d+5)/2}\epsilon^{-(d+9)/2})$ samples.
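To illustrate how the general mixture theorem instantiates to the Gaussian bound, here is a sketch of the arithmetic; the single-Gaussian sample complexity $m_{\mathcal{F}}(\epsilon) = \widetilde{O}(d^{2}/\epsilon^{2})$ plugged in below is a standard agnostic-learning bound assumed for illustration, not a claim of the abstract itself:
\[
  m_{\mathcal{F}^k}(\epsilon)
    \;=\; O\!\left(\frac{k \log k \cdot m_{\mathcal{F}}(\epsilon)}{\epsilon^{2}}\right)
    \;=\; O\!\left(\frac{k \log k}{\epsilon^{2}}\right) \cdot \widetilde{O}\!\left(\frac{d^{2}}{\epsilon^{2}}\right)
    \;=\; \widetilde{O}\!\left(\frac{k d^{2}}{\epsilon^{4}}\right),
\]
matching the stated bound for mixtures of $k$ Gaussians in $\mathbb{R}^d$; the $\log k$ factor is absorbed into the $\widetilde{O}$ notation.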