Optimal variable selection in multi-group sparse discriminant analysis

This article considers the problem of multi-group classification in the setting where the number of variables is larger than the number of observations . Several methods have been proposed in the literature that address this problem, however their variable selection performance is either unknown or suboptimal to the results known in the two-group case. In this work we provide sharp conditions for the consistent recovery of relevant variables in the multi-group case using the discriminant analysis proposal of Gaynanova et al., 2014. We achieve the rates of convergence that attain the optimal scaling of the sample size , number of variables and the sparsity level . These rates are significantly faster than the best known results in the multi-group case. Moreover, they coincide with the optimal minimax rates for the two-group case. We validate our theoretical results with numerical analysis.
View on arXiv