Optimal choice of $k$ for $k$-nearest neighbor regression

12 September 2019

Abstract

The $k$-nearest neighbor algorithm ($k$-NN) is a widely used non-parametric method for classification and regression. We study the mean squared error of the $k$-NN estimator when $k$ is chosen by leave-one-out cross-validation (LOOCV). Although this choice of $k$ was known to be asymptotically consistent, it was not previously known to be optimal. We show that, with high probability, the mean squared error of this estimator is close to the minimum mean squared error achievable by a $k$-NN estimate, where the minimum is taken over all choices of $k$.
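To make the selection rule concrete, the following is a minimal sketch of choosing $k$ by LOOCV for $k$-NN regression. It is an illustration only, not the authors' implementation: the function names `loocv_knn_mse` and `choose_k`, the use of Euclidean distance, and the unweighted neighbor average are all assumptions made for this example.

```python
import numpy as np


def loocv_knn_mse(X, y, k_values):
    """Leave-one-out mean squared error of k-NN regression for each k.

    Assumes X has shape (n, d), y has shape (n,), and every k satisfies
    1 <= k <= n - 1. Distances are Euclidean (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)

    # Pairwise squared Euclidean distances between all sample points.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diffs, diffs)

    # Leave-one-out: a point may not be its own neighbor, so push it last.
    np.fill_diagonal(dist, np.inf)
    order = np.argsort(dist, axis=1)[:, : n - 1]

    # Column j holds the mean response of the j + 1 nearest neighbors.
    running_mean = np.cumsum(y[order], axis=1) / np.arange(1, n)

    errors = {}
    for k in k_values:
        predictions = running_mean[:, k - 1]
        errors[k] = float(np.mean((predictions - y) ** 2))
    return errors


def choose_k(X, y, k_values):
    """Return the k with the smallest LOOCV error, plus all errors."""
    errors = loocv_knn_mse(X, y, k_values)
    best_k = min(errors, key=errors.get)
    return best_k, errors
```

Calling `choose_k(X, y, range(1, n))` scans every admissible $k$. Sorting the neighbors once and taking running means lets all values of $k$ share the same distance computation, at the cost of $O(n^2)$ memory for the distance matrix.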
