Group Lasso for generalized linear models in high dimension

11 August 2013

Abstract

An increasing amount of data is now available, and we have to deal with high-dimensional models, in which the number of covariates is much larger than the sample size. Under a sparsity assumption, it is reasonable to hope that the regression parameter can be estimated well. This sparsity assumption, together with a block structure that gathers the covariates into groups with similar modes of behavior, is quite natural in genomics, for example. A large body of literature exists for Gaussian linear models, including the Lasso estimator and the Group Lasso estimator, which promotes group sparsity when the groups are known a priori. We extend the Group Lasso procedure to generalized linear models and study the properties of this estimator for sparse high-dimensional generalized linear models in order to derive convergence rates. We provide oracle inequalities for the prediction and estimation errors under assumptions on the covariates and under a condition on the design matrix. We show that this estimator can recover a good sparse approximation of the true model. Finally, we extend these results to the case of an Elastic net penalty and apply them to Poisson regression, which, unlike logistic regression, has not been studied in this context.
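
To make the estimator described in the abstract concrete, the following is a minimal sketch of a Group Lasso fit for Poisson regression, computed by proximal gradient descent with backtracking. It is an illustration only, not the paper's algorithm: the abstract does not specify a solver, and the function names, the canonical log link, the averaged negative log-likelihood, and the penalty weights (square root of the group size, a common convention) are all assumptions made here.

```python
import numpy as np


def group_soft_threshold(v, t):
    """Proximal operator of t * ||v||_2 (block soft-thresholding)."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v


def poisson_loss(X, y, beta):
    """Averaged Poisson negative log-likelihood (log link, constants dropped)."""
    eta = X @ beta
    return np.mean(np.exp(eta) - y * eta)


def prox_group(beta, groups, step, lam):
    """Apply block soft-thresholding group by group.

    Assumed weighting: lam * sqrt(group size) for each group.
    """
    out = beta.copy()
    for idx in groups:
        out[idx] = group_soft_threshold(beta[idx], step * lam * np.sqrt(len(idx)))
    return out


def fit_poisson_group_lasso(X, y, groups, lam, n_iter=200):
    """Proximal gradient descent with backtracking (hypothetical helper).

    groups is a list of index lists that partition the columns of X.
    """
    n, p = X.shape
    beta = np.zeros(p)
    step = 1.0
    # exp() may overflow for a step that is too large; such candidates
    # get an infinite or NaN loss and are rejected by the backtracking test.
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(n_iter):
            grad = X.T @ (np.exp(X @ beta) - y) / n
            loss = poisson_loss(X, y, beta)
            while True:
                cand = prox_group(beta - step * grad, groups, step, lam)
                diff = cand - beta
                # Quadratic upper bound on the smooth part at the candidate.
                bound = loss + grad @ diff + diff @ diff / (2.0 * step)
                if poisson_loss(X, y, cand) <= bound:
                    break
                step *= 0.5
            beta = cand
    return beta
```

Calling `fit_poisson_group_lasso(X, y, groups, lam)` with a larger `lam` sets whole groups of coefficients exactly to zero, which is the group sparsity the abstract refers to. Backtracking is used because the Poisson gradient is not globally Lipschitz, so no fixed step size is safe in general.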
