K-Nearest Neighbor Approximation Via the Friend-of-a-Friend Principle

Abstract

Suppose $V$ is an $n$-element set where for each $x \in V$, the elements of $V \setminus \{x\}$ are ranked by their similarity to $x$. The $K$-nearest neighbor graph is a directed graph including an arc from each $x$ to the $K$ points of $V \setminus \{x\}$ most similar to $x$. Constructive approximation to this graph using far fewer than $n^2$ comparisons is important for the analysis of large high-dimensional data sets. $K$-Nearest Neighbor Descent is a parameter-free heuristic where a sequence of graph approximations is constructed, in which second neighbors in one approximation are proposed as neighbors in the next. We provide a rigorous justification for $O(n \log n)$ complexity of a similar algorithm, using range queries, when applied to a homogeneous Poisson process in suitable dimension, but show that the basic algorithm fails to achieve subquadratic complexity on sets whose similarity rankings arise from a "generic" linear order on the $\binom{n}{2}$ inter-point distances in a metric space.
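
The friend-of-a-friend update described above can be illustrated with a minimal sketch. This is a simplified illustration under stated assumptions, not the algorithm analyzed in the paper: it starts from random neighbor lists, proposes only out-neighbors of out-neighbors as candidates, and updates all points synchronously. The function name `knn_descent` and the parameters `max_rounds` and `seed` are hypothetical and introduced here only for illustration.

```python
import random


def knn_descent(points, k, dist, max_rounds=20, seed=0):
    """Simplified friend-of-a-friend descent (illustrative sketch only).

    Returns (neighbors, comparisons), where neighbors[i] is a set of k
    indices and comparisons counts distance evaluations (not cached).
    Assumes len(points) > k.
    """
    rng = random.Random(seed)
    n = len(points)
    # Round 0: every point starts with k arbitrary "friends".
    nbrs = [set(rng.sample([j for j in range(n) if j != i], k))
            for i in range(n)]
    comparisons = 0
    for _ in range(max_rounds):
        updated = []
        for i in range(n):
            # Candidates: current friends plus friends of friends,
            # all read from the previous approximation.
            cand = set(nbrs[i])
            for j in nbrs[i]:
                cand |= nbrs[j]
            cand.discard(i)
            comparisons += len(cand)
            # Keep the k most similar candidates; ties broken by index.
            ranked = sorted(cand, key=lambda j: (dist(points[i], points[j]), j))
            updated.append(set(ranked[:k]))
        if updated == nbrs:
            break  # no neighbor list changed, so further rounds are no-ops
        nbrs = updated
    return nbrs, comparisons
```

Because each round selects the best $K$ candidates from a set that contains the current neighbors, the neighbor lists of every point can only improve or stay the same. The sketch gives no guarantee of reaching the exact $K$-nearest neighbor graph, and the returned comparison count is what one would compare against the roughly $n^2$ evaluations of a brute-force search.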
