Shrinkage estimators for out-of-sample prediction in high-dimensional linear models
We study the unconditional out-of-sample prediction error (predictive risk) of two classes of smooth shrinkage estimators for the linear model: James-Stein type shrinkage estimators and ridge regression estimators. Our study is motivated by problems in high-dimensional data analysis, and our results are especially relevant to settings where the number of predictors and the number of observations are both large. Two important aspects of our approach are that (i) we assume the data are drawn from a multivariate normal distribution and (ii) we work in an asymptotic framework that is appropriate for high-dimensional datasets and offers substantial simplifications over many existing approaches to studying shrinkage estimators for the linear model. Ultimately, our results comport with classical results and show that significant reductions in out-of-sample prediction error can be achieved by using shrinkage estimators instead of the ordinary least squares estimator. However, our results also provide a means for a detailed yet transparent comparative analysis of the different estimators, which helps to shed light on their relative merits. For instance, we use results from random matrix theory to obtain explicit closed-form expressions for the asymptotic predictive risk of the estimators considered here (in fact, many of the relevant results are non-asymptotic). Additionally, we identify minimax ridge and James-Stein estimators that outperform previously proposed shrinkage estimators, and we prove that if the population predictor covariance is known, or if an operator norm-consistent estimator of it is available, then the ridge estimator has smaller asymptotic predictive risk than the James-Stein estimator.
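To fix ideas, the display below gives the standard definitions these claims refer to; the notation (Sigma, lambda, c) is our choice and may differ from the paper's. The expectation runs over both the training sample and the new observation, which is what makes the predictive risk "unconditional."

```latex
% Setup assumed here (notation ours): training data (X, y) with rows
% x_i ~ N(0, Sigma), y = X*beta + eps, eps ~ N(0, sigma^2 I_n), and an
% independent test point (x_new, y_new) drawn from the same model.
\[
  R(\hat\beta)
    = \mathbb{E}\!\left[(y_{\mathrm{new}} - x_{\mathrm{new}}^{\top}\hat\beta)^{2}\right]
    = \sigma^{2} + \mathbb{E}\!\left[(\hat\beta-\beta)^{\top}\Sigma\,(\hat\beta-\beta)\right],
\]
\[
  \hat\beta_{\mathrm{ridge}}(\lambda) = (X^{\top}X + \lambda I_{d})^{-1}X^{\top}y,
  \qquad
  \hat\beta_{\mathrm{JS}}(c) = c\,\hat\beta_{\mathrm{OLS}}, \quad 0 \le c \le 1,
\]
% i.e., James-Stein type estimators shrink OLS toward zero by a scalar factor.
```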
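As a minimal illustration (not the paper's experiments), the following Python sketch approximates the unconditional predictive risk of OLS, an oracle-style ridge estimator, and a positive-part James-Stein type estimator by Monte Carlo in a regime with d/n = 1/2. The sample sizes, the penalty choice lambda = d*sigma^2/||beta||^2, and the particular James-Stein construction are our assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes in the "both n and d large" regime; not the paper's setup.
n, d, sigma, reps = 500, 250, 1.0, 50
beta = rng.normal(size=d) / np.sqrt(d)   # fixed signal with ||beta||^2 ~= 1

def risks_one_draw():
    X = rng.normal(size=(n, d))                       # rows x_i ~ N(0, I_d)
    y = X @ beta + sigma * rng.normal(size=n)

    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

    # Oracle-style ridge penalty for Sigma = I (assumes ||beta||^2 is known).
    lam = d * sigma**2 / np.sum(beta**2)
    beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

    # One common James-Stein type construction: positive-part scalar
    # shrinkage of OLS toward zero, with sigma^2 estimated from residuals.
    sigma2_hat = np.sum((y - X @ beta_ols) ** 2) / (n - d)
    c = max(0.0, 1.0 - (d - 2) * sigma2_hat / np.sum((X @ beta_ols) ** 2))
    beta_js = c * beta_ols

    # With x_new ~ N(0, I_d): E[(y_new - x_new' b)^2 | b] = sigma^2 + ||b - beta||^2.
    return [sigma**2 + np.sum((b - beta) ** 2)
            for b in (beta_ols, beta_ridge, beta_js)]

# Average over training draws to approximate the *unconditional* predictive risk.
avg = np.mean([risks_one_draw() for _ in range(reps)], axis=0)
for name, r in zip(("OLS", "ridge", "James-Stein"), avg):
    print(f"{name:12s} predictive risk ~= {r:.3f}")
```

With an isotropic Gaussian design the OLS predictive risk has the known closed form sigma^2 * (1 + d/(n - d - 1)), roughly 2*sigma^2 at these sizes, and both shrinkage estimators should come in noticeably lower, consistent with the abstract's comparison.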