Accuracy Assessment for High-dimensional Linear Regression

10 March 2016

T. Tony Cai

Abstract

This paper considers point and interval estimation of the $\ell_q$ loss of an estimator in high-dimensional linear regression with random design. Two settings are studied: one where the design covariance matrix is known to be the identity and the noise level is known, and one where both the design covariance matrix and the noise level are unknown. We establish the minimax convergence rate for estimating the $\ell_q$ loss and the minimax expected length of confidence intervals for the $\ell_q$ loss of a broad collection of estimators of the regression vector. We also investigate the adaptivity of the confidence intervals for the $\ell_q$ loss. The results reveal interesting and significant differences between estimating the $\ell_2$ loss and the $\ell_q$ loss with $1 \le q < 2$, as well as differences between the two settings. A major step in our analysis is to establish rate-sharp lower bounds for the minimax estimation error and for the expected length of minimax and adaptive confidence intervals for the $\ell_q$ loss, which requires the development of new technical tools. A significant difference between loss estimation and traditional parameter estimation is that, for loss estimation, the constraint is on the performance of the estimator of the regression vector, whereas the lower bounds are on the difficulty of estimating its $\ell_q$ loss. The technical tools developed in this paper may also be of independent interest.
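To make the target quantity concrete, the following is a minimal sketch of the $\ell_q$ loss of an estimator when the true regression vector is available, as it would be in a simulation. The function name `lq_loss`, the toy vectors, and the choice to square the norm by default are assumptions made for illustration; they are not taken from the abstract. With real data the true vector is unknown, so this quantity cannot be computed directly, which is why it has to be estimated.

```python
def lq_loss(beta_hat, beta, q, squared=True):
    """l_q loss between an estimate and the true regression vector.

    Hypothetical helper for illustration. Squaring the norm is an
    assumption; pass squared=False to get the plain l_q norm.
    """
    if len(beta_hat) != len(beta):
        raise ValueError("beta_hat and beta must have the same length")
    if q < 1:
        raise ValueError("q must be at least 1")
    # l_q norm of the estimation error beta_hat - beta
    norm = sum(abs(a - b) ** q for a, b in zip(beta_hat, beta)) ** (1.0 / q)
    return norm ** 2 if squared else norm


# Toy sparse regression vector and an estimate of it (made-up numbers).
beta = [3.0, 0.0, 0.0, -4.0, 0.0]
beta_hat = [2.0, 0.0, 0.0, -2.0, 0.0]

loss_l1 = lq_loss(beta_hat, beta, q=1)  # error entries are 1 and 2
loss_l2 = lq_loss(beta_hat, beta, q=2)
print(loss_l1, loss_l2)
```

The same estimation error gives different losses for different $q$, so the two cases $q = 2$ and $1 \le q < 2$ are distinct targets.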
