Provable Tensor Methods for Learning Mixtures of Classifiers

We consider the problem of learning associative mixtures for classification and regression, where the output is modeled as a mixture of conditional distributions, conditioned on the input. In contrast to approaches such as expectation maximization (EM) or variational Bayes, which can get stuck in bad local optima, we present a tensor decomposition method that is guaranteed to correctly recover the parameters. The key insight is to learn score function features of the input and employ them in a moment-based approach for learning associative mixtures. Specifically, we construct the cross-moment tensor between the label and higher-order score functions of the input. We establish that the decomposition of this tensor consistently recovers the components of the associative mixture under simple non-degeneracy assumptions. Feature learning is thus the critical ingredient for consistent estimation of associative mixtures using tensor decomposition approaches.
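
The cross-moment construction described above can be illustrated with a short worked derivation. This is a sketch under stated assumptions, not the paper's exact statement: the abstract fixes no notation, so the symbols $p(x)$, $G(x)$, $w_i$, $u_i$, $b_i$, $g$, and $k$ are introduced here for illustration, and the mixture is assumed to act on the input through linear projections $\langle u_i, x \rangle$. The order-3 case is shown as one instance of a "higher-order" score function.

```latex
% Assumed notation (not from the abstract): x \in \mathbb{R}^d has density p(x),
% y is the label, and G(x) := \mathbb{E}[y \mid x].
% Requires amsmath (align*) and amssymb (\mathbb).
\begin{align*}
\mathcal{S}_m(x) &:= (-1)^m \, \frac{\nabla_x^{(m)} p(x)}{p(x)}
  && \text{($m$-th order score function)} \\
M_3 &:= \mathbb{E}\big[\, y \cdot \mathcal{S}_3(x) \,\big]
  = \mathbb{E}\big[\, \nabla_x^{(3)} G(x) \,\big]
  && \text{(Stein-type identity, under regularity conditions)} \\
G(x) &= \sum_{i=1}^{k} w_i \, g\big(\langle u_i, x \rangle + b_i\big)
  && \text{(assumed mixture form)} \\
\Longrightarrow\quad M_3 &= \sum_{i=1}^{k} w_i \,
  \mathbb{E}\big[ g'''\big(\langle u_i, x \rangle + b_i\big) \big] \;
  u_i \otimes u_i \otimes u_i
  && \text{(rank-one terms aligned with each $u_i$)}
\end{align*}
```

Under these assumptions, $M_3$ has a rank-$k$ CP form, so decomposing an empirical estimate of it would recover the directions $u_i$ up to scaling. In this sketch, a non-degeneracy condition would plausibly amount to linearly independent $u_i$ and nonzero coefficients $w_i \, \mathbb{E}[g'''(\cdot)]$. For standard Gaussian input, $\mathcal{S}_3(x)$ reduces to the third-order Hermite tensor, $x \otimes x \otimes x$ minus the symmetrized $x \otimes I$ terms.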