We propose a novel estimator for the number of components in a K-variate non-parametric finite mixture model, where the analyst has repeated observations of K variables that are independent given a finitely supported unobserved variable. Under a mild assumption on the joint distribution of the observed and latent variables, we show that an integral operator, which is identified from the data, has rank equal to the number of components. Using this observation, and the fact that singular values are stable under perturbations, we propose an estimator based on a thresholding rule: it essentially counts the singular values of a consistent estimator of this operator that exceed a data-driven threshold. We prove that the estimator is consistent and establish non-asymptotic results that provide finite-sample performance guarantees. A Monte Carlo study shows that the estimator performs well for samples of moderate size.
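To make the thresholding idea concrete, the following is a minimal sketch in a simplified discrete setting, not the construction used in the paper. It assumes two observed variables taking finitely many values, so that the joint probability matrix plays the role of the integral operator and its rank equals the number of components when the component distributions are linearly independent. The function names, the simulated design, and the threshold `sqrt(log(n) / n)` are all illustrative assumptions; the paper's data-driven threshold is not reproduced here.

```python
import numpy as np


def empirical_joint(x1, x2, n_bins):
    """Empirical joint probability matrix of two discrete variables."""
    counts = np.zeros((n_bins, n_bins))
    np.add.at(counts, (x1, x2), 1.0)  # accumulate one count per observation
    return counts / len(x1)


def estimate_num_components(joint_hat, threshold):
    """Count the singular values of the estimated matrix above the threshold."""
    singular_values = np.linalg.svd(joint_hat, compute_uv=False)
    return int(np.sum(singular_values > threshold))


def draw_discrete(rng, probs_per_obs):
    """Inverse-CDF draw: one category per row of a matrix of probabilities."""
    cdf = np.cumsum(probs_per_obs, axis=1)
    u = rng.random(len(probs_per_obs))
    draws = (u[:, None] > cdf).sum(axis=1)
    # Guard against floating-point round-off in the last CDF entry.
    return np.minimum(draws, probs_per_obs.shape[1] - 1)


# Hypothetical design: 2 latent components, 4 categories per observed variable.
weights = np.array([0.5, 0.5])
component_probs = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.7],
])

rng = np.random.default_rng(0)
n = 20_000
z = rng.choice(len(weights), size=n, p=weights)   # latent component labels
x1 = draw_discrete(rng, component_probs[z])       # independent given z
x2 = draw_discrete(rng, component_probs[z])

joint_hat = empirical_joint(x1, x2, n_bins=4)
threshold = np.sqrt(np.log(n) / n)                # illustrative choice only
estimated_components = estimate_num_components(joint_hat, threshold)

# Population counterpart: a weighted sum of rank-one matrices.
joint_population = sum(
    w * np.outer(p, p) for w, p in zip(weights, component_probs)
)
population_rank = estimate_num_components(joint_population, 1e-8)
```

In this design the population matrix has two non-zero singular values (0.34 and 0.18), well above the sampling noise at this sample size, so the count is insensitive to moderate changes in the threshold. With weaker separation between components or smaller samples, the choice of threshold matters much more, which is where a carefully calibrated data-driven rule becomes important.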