Optimal choice of k for k-nearest neighbor regression

Abstract
The k-nearest neighbor algorithm (k-NN) is a widely used non-parametric method for classification and regression. We study the mean squared error of the k-NN estimator when k is chosen by leave-one-out cross-validation (LOOCV). Although it was known that this choice of k is asymptotically consistent, it was not previously known that it is an optimal k. We show that, with high probability, the mean squared error of this estimator is close to the minimum mean squared error achievable by the k-NN estimate, where the minimum is over all choices of k.
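
The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes Euclidean distance, breaks distance ties by index order, and uses hypothetical function names (`loocv_knn_errors`, `choose_k_loocv`) and synthetic data chosen only for illustration. For each candidate k, every point is predicted by the average response of its k nearest neighbors among the remaining points, and the k with the smallest average squared error is returned.

```python
import math
import random


def loocv_knn_errors(X, y, k_max):
    """Return the LOOCV mean squared error of k-NN regression for k = 1..k_max.

    X is a list of equal-length tuples, y a list of floats (hypothetical layout).
    """
    n = len(X)
    sq_err = [0.0] * k_max
    for i in range(n):
        # Rank all other points by Euclidean distance to X[i]; point i is left out.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(X[i], X[j]))
        running_sum = 0.0
        for k in range(1, k_max + 1):
            # Extend the neighbor average by one point instead of recomputing it.
            running_sum += y[others[k - 1]]
            prediction = running_sum / k
            sq_err[k - 1] += (y[i] - prediction) ** 2
    return [e / n for e in sq_err]


def choose_k_loocv(X, y, k_max=None):
    """Pick the k minimizing the LOOCV error; returns (best_k, errors)."""
    n = len(X)
    if k_max is None:
        k_max = n - 1  # at most n - 1 neighbors remain once a point is held out
    errors = loocv_knn_errors(X, y, k_max)
    best_k = min(range(1, k_max + 1), key=lambda k: errors[k - 1])
    return best_k, errors


if __name__ == "__main__":
    # Synthetic one-dimensional data, an assumption made only for this demo.
    random.seed(0)
    X = [(random.uniform(0.0, 1.0),) for _ in range(60)]
    y = [math.sin(2 * math.pi * x[0]) + random.gauss(0.0, 0.3) for x in X]
    best_k, errors = choose_k_loocv(X, y)
    print("k chosen by LOOCV:", best_k)
    print("LOOCV error at that k:", errors[best_k - 1])
```

This brute-force version sorts the remaining points once per held-out point, so it costs on the order of n² log n operations; that is acceptable for small samples, while larger ones would call for a spatial index.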