On the number of variables to use in principal component regression

Abstract
We study least squares linear regression over $N$ uncorrelated Gaussian features that are selected in order of decreasing variance. When the number of selected features $p$ is at most the sample size $n$, the estimator under consideration coincides with the principal component regression estimator; when $p > n$, the estimator is the least $\ell_2$ norm solution over the selected features. We give an average-case analysis of the out-of-sample prediction error as $p, n, N \to \infty$ with $p/N \to \alpha$ and $n/N \to \beta$, for some constants $\alpha \in [0, 1]$ and $\beta \in (0, 1)$. In this average-case setting, the prediction error exhibits a "double descent" shape as a function of $p$. We also establish conditions under which the minimum risk is achieved in the interpolating ($p > n$) regime.
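The estimator described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the dimensions, the variance profile, and the noise level below are all hypothetical choices made for illustration. The sketch assumes the feature variances are known, selects the highest-variance features, and relies on `np.linalg.lstsq`, which returns the ordinary least squares fit when the selected design has full column rank and the minimum $\ell_2$ norm solution when the system is underdetermined.

```python
import numpy as np


def fit_top_variance_features(X, y, variances, p):
    """Least squares on the p highest-variance features.

    When p exceeds the sample size, np.linalg.lstsq returns the
    minimum l2-norm solution, which interpolates the training data
    if the selected design matrix has full row rank.
    """
    order = np.argsort(variances)[::-1]  # indices by decreasing variance
    selected = order[:p]
    coef_selected, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
    coef = np.zeros(X.shape[1])  # unselected features get coefficient 0
    coef[selected] = coef_selected
    return coef


def excess_risk(coef, theta, variances):
    """Out-of-sample squared prediction error, excluding label noise.

    For uncorrelated zero-mean features this equals
    sum_j variances[j] * (coef[j] - theta[j])**2.
    """
    return float(np.sum(variances * (coef - theta) ** 2))


# Hypothetical problem instance (all values are illustrative assumptions).
rng = np.random.default_rng(0)
N, n = 60, 20                            # total features, sample size
variances = np.linspace(2.0, 0.1, N)     # assumed known, decreasing
theta = rng.normal(size=N) / np.sqrt(N)  # assumed true coefficients
X = rng.normal(size=(n, N)) * np.sqrt(variances)
y = X @ theta + 0.1 * rng.normal(size=n)

# Risk at a few values of p, on both sides of the threshold p = n.
risks = {
    p: excess_risk(fit_top_variance_features(X, y, variances, p), theta, variances)
    for p in (5, 10, 20, 40, 60)
}
for p, r in risks.items():
    print(f"p = {p:2d}  excess risk = {r:.4f}")
```

A single random draw like this one need not reproduce the shape of the average-case risk curve; averaging over many draws of `X`, `y`, and `theta` would be a closer, though still finite-sample, analogue of the asymptotic analysis.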