Estimator selection in the Gaussian setting

We consider the problem of estimating the mean of a Gaussian vector with independent components of common unknown variance. Our estimation procedure is based on estimator selection. More precisely, we start with an arbitrary and possibly infinite collection of estimators of the mean, all based on the observed vector, and, using the same data, aim at selecting an estimator from this collection with the smallest Euclidean risk. No assumptions are made on the estimators, and their dependence on the data may be unknown. We establish a non-asymptotic risk bound for the selected estimator. As particular cases, our approach makes it possible to handle the problems of aggregation and model selection, as well as those of choosing a window and a kernel for estimating a regression function, or tuning the parameter involved in a penalized criterion. We also derive oracle-type inequalities when the collection consists of linear estimators. For illustration, we carry out two simulation studies. The first compares our procedure to cross-validation for choosing a tuning parameter. The second shows how to implement our approach to solve the problem of variable selection in practice.
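To make the setting concrete, the following is a minimal sketch of the oracle that an estimator selection procedure tries to mimic when the collection consists of linear estimators. It is not the selection criterion of the paper, which the abstract does not state. The design matrix `X`, the ridge family, the grid `lambdas`, and the helper names `ridge_smoother` and `linear_risk` are all assumptions chosen for illustration. The true mean `f` is known here only because the data are simulated; in practice it is unknown, which is why a data-driven selection rule is needed.

```python
import numpy as np

def ridge_smoother(X, lam):
    """Return the n x n matrix A such that the ridge fit is A @ Y (illustrative family)."""
    p = X.shape[1]
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)

def linear_risk(A, f, sigma):
    """Exact Euclidean risk E||f - A Y||^2 when Y = f + sigma * standard Gaussian noise."""
    bias_sq = np.sum((f - A @ f) ** 2)
    variance = sigma ** 2 * np.sum(A ** 2)  # sigma^2 * trace(A^T A)
    return bias_sq + variance

rng = np.random.default_rng(0)
n, p, sigma = 50, 10, 1.0
X = rng.standard_normal((n, p))
f = X @ rng.standard_normal(p)  # true mean, available only in simulation
lambdas = np.array([0.0, 0.1, 1.0, 10.0, 100.0])  # hypothetical tuning grid

# Risk of each linear estimator in the collection, and the oracle choice.
risks = np.array([linear_risk(ridge_smoother(X, lam), f, sigma) for lam in lambdas])
oracle = int(np.argmin(risks))
print("risks:", np.round(risks, 3))
print("oracle lambda:", lambdas[oracle])
```

An oracle-type inequality then states that the risk of the selected estimator is comparable, up to constants and remainder terms, to the smallest entry of `risks`, even though that entry cannot be computed from the data alone.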