Optimal variable selection in multi-group sparse discriminant analysis

23 November 2014

Abstract

This article considers the problem of multi-group classification in the setting where the number of variables $p$ is larger than the number of observations $n$. Several methods have been proposed in the literature to address this problem; however, their variable selection performance is either unknown or suboptimal relative to the results known in the two-group case. In this work we provide sharp conditions for the consistent recovery of relevant variables in the multi-group case using the discriminant analysis proposal of Gaynanova et al. (2014). We achieve rates of convergence that attain the optimal scaling of the sample size $n$, number of variables $p$, and sparsity level $s$. These rates are significantly faster than the best known results in the multi-group case. Moreover, they coincide with the optimal minimax rates for the two-group case. We validate our theoretical results with numerical analysis.
