
Accuracy Assessment for High-dimensional Linear Regression

T. Tony Cai
Abstract

This paper considers point and interval estimation of the $\ell_q$ loss of an estimator in high-dimensional linear regression with random design. Both the setting of known identity design covariance matrix and known noise level and the setting of unknown design covariance matrix and noise level are studied. We establish the minimax convergence rate for estimating the $\ell_q$ loss and the minimax expected length of confidence intervals for the $\ell_q$ loss of a broad collection of estimators of the regression vector. We also investigate the adaptivity of the confidence intervals for the $\ell_q$ loss. The results reveal interesting and significant differences between estimating the $\ell_2$ loss and $\ell_q$ loss with $1 \le q < 2$ as well as the differences between the two settings. A major step in our analysis is to establish rate sharp lower bounds for the minimax estimation error and the expected length of minimax and adaptive confidence intervals for the $\ell_q$ loss, which requires the development of new technical tools. A significant difference between loss estimation and the traditional parameter estimation is that for loss estimation the constraint is on the performance of the estimator of the regression vector, but the lower bounds are on the difficulty of estimating its $\ell_q$ loss. The technical tools developed in this paper can also be of independent interest.
