Rate Optimal Estimation and Confidence Intervals for High-dimensional Regression with Missing Covariates

9 February 2017

Aarti Singh

Abstract

We consider the problem of estimating a sparse high-dimensional linear regression model, and constructing component-wise confidence intervals for its coefficients, when some covariates of the design matrix are missing completely at random. A variant of the Dantzig selector (Candes & Tao, 2007) is analyzed for estimating the regression model, and a de-biasing argument is employed to construct component-wise confidence intervals under additional assumptions on the covariance of the design matrix. We also derive rates of convergence of the mean-square estimation error and the average confidence interval length, and show that the dependence on several model parameters (e.g., sparsity $s$, portion of observed covariates $\rho_*$, signal level $\|\beta_0\|_2$) is optimal in a minimax sense.
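
The abstract does not spell out how the Dantzig selector variant handles missing entries, so the following is only a minimal sketch of one common bias correction for covariates missing completely at random, not necessarily the paper's estimator. It assumes a zero-filled design matrix `Z` (missing entries set to 0) and a single known observation probability `rho`; the function name `mcar_corrected_moments` and both of these assumptions are illustrative and do not come from the source.

```python
import numpy as np

def mcar_corrected_moments(Z, y, rho):
    """Bias-corrected moment estimates from a zero-filled design matrix.

    Assumptions (hypothetical, not taken from the abstract):
      Z   : (n, p) array, unobserved covariate entries replaced by 0
      y   : (n,) response vector
      rho : probability that any single covariate entry is observed
    Returns estimates of E[x x^T] and E[x y].
    """
    n = Z.shape[0]
    gram = Z.T @ Z / n
    # An off-diagonal product Z_ij * Z_ik is nonzero only when both
    # entries are observed, which happens with probability rho**2.
    sigma_hat = gram / rho**2
    # A diagonal term Z_ij**2 needs only one entry to be observed,
    # so it is rescaled by rho instead of rho**2.
    np.fill_diagonal(sigma_hat, np.diag(gram) / rho)
    # Each term Z_ij * y_i survives with probability rho.
    gamma_hat = Z.T @ y / (n * rho)
    return sigma_hat, gamma_hat
```

Under these assumptions, a Dantzig-selector-style estimate would minimize $\|\beta\|_1$ subject to $\|\hat\Sigma \beta - \hat\gamma\|_\infty \le \lambda$, with `sigma_hat` and `gamma_hat` standing in for the fully observed moments. With `rho = 1` the function reduces to the usual `Z.T @ Z / n` and `Z.T @ y / n`.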
