Optimal choice of k for k-nearest neighbor regression

Abstract

The k-nearest neighbor algorithm (k-NN) is a widely used non-parametric method for classification and regression. We study the mean squared error of the k-NN estimator when k is chosen by leave-one-out cross-validation (LOOCV). Although this choice of k was known to be asymptotically consistent, it was not previously known to be optimal. We show that, with high probability, the mean squared error of this estimator is close to the minimum mean squared error achievable by the k-NN estimate, where the minimum is over all choices of k.
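To make the selection procedure concrete, the following is a minimal sketch of choosing k by LOOCV for k-NN regression. It is an illustration only, not the paper's implementation: the function names `loocv_knn_mse` and `choose_k` are hypothetical, and it assumes Euclidean distance, an unweighted average of the k nearest responses, and ties broken by index order.

```python
import numpy as np


def loocv_knn_mse(X, y, k_values):
    """Leave-one-out mean squared error of k-NN regression for each k.

    Assumptions (not taken from the paper): Euclidean distance,
    unweighted neighbor average, ties broken by index order.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)

    # Pairwise squared Euclidean distances between all sample points.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sum(diffs ** 2, axis=2)

    # Leave-one-out: a point may not be its own neighbor.
    np.fill_diagonal(dist, np.inf)

    # Neighbor indices sorted by distance; a stable sort makes ties deterministic.
    order = np.argsort(dist, axis=1, kind="stable")

    # Column j holds the mean response of the (j + 1) nearest neighbors.
    running_mean = np.cumsum(y[order], axis=1) / np.arange(1, n + 1)

    errors = {}
    for k in k_values:
        if not 1 <= k <= n - 1:
            raise ValueError("k must satisfy 1 <= k <= n - 1")
        predictions = running_mean[:, k - 1]
        errors[k] = float(np.mean((predictions - y) ** 2))
    return errors


def choose_k(X, y, k_values):
    """Return the k with the smallest LOOCV error, plus all errors."""
    errors = loocv_knn_mse(X, y, k_values)
    best_k = min(errors, key=errors.get)
    return best_k, errors


if __name__ == "__main__":
    # Synthetic data, purely for illustration.
    rng = np.random.default_rng(0)
    X = rng.uniform(-3.0, 3.0, size=(200, 1))
    y = np.sin(X[:, 0]) + 0.3 * rng.normal(size=200)
    best_k, errors = choose_k(X, y, range(1, 51))
    print("LOOCV choice of k:", best_k)
```

The sketch sorts each point's neighbors once and reuses running means, so evaluating every candidate k costs little beyond the initial distance computation. The result stated in the abstract concerns how close the error of the selected k is to the best error over all k; this code only illustrates the selection step itself.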
