K-Nearest Neighbor Approximation Via the Friend-of-a-Friend Principle

Suppose $V$ is an $n$-element set where, for each $x \in V$, the elements of $V \setminus \{x\}$ are ranked by their similarity to $x$. The $K$-nearest neighbor graph is a directed graph including an arc from each $x$ to the $K$ points of $V \setminus \{x\}$ most similar to $x$. Constructive approximation to this graph using far fewer than $n^2$ comparisons is important for the analysis of large high-dimensional data sets. \emph{$K$-Nearest Neighbor Descent} is a parameter-free heuristic where a sequence of graph approximations is constructed, in which second neighbors in one approximation are proposed as neighbors in the next. We provide a rigorous justification for the complexity of a similar algorithm, using range queries, when applied to a homogeneous Poisson process in suitable dimension, but show that the basic algorithm fails to achieve subquadratic complexity on sets whose similarity rankings arise from a "generic" linear order on the inter-point distances in a metric space.
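
The friend-of-a-friend update described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm or the range-query variant they analyze: the function name `knn_descent`, the parameters `dist`, `max_rounds` and `seed`, the random initial graph, and the stopping rule are all assumptions made for illustration, and only out-neighbors are used when proposing candidates.

```python
import random


def knn_descent(points, k, dist, max_rounds=20, seed=0):
    """Illustrative friend-of-a-friend approximation to the K-NN graph.

    Hypothetical sketch: returns (neighbor sets, number of comparisons).
    """
    rng = random.Random(seed)
    n = len(points)
    # Assumed initialization: each point starts with k arbitrary "friends".
    nbrs = [set(rng.sample([j for j in range(n) if j != i], k))
            for i in range(n)]
    comparisons = 0
    for _ in range(max_rounds):
        changed = False
        new_nbrs = []
        for i in range(n):
            # Candidates are current friends plus friends of friends.
            cand = set(nbrs[i])
            for j in nbrs[i]:
                cand |= nbrs[j]
            cand.discard(i)
            comparisons += len(cand)
            # Keep the k candidates most similar to point i.
            ranked = sorted(cand, key=lambda j: dist(points[i], points[j]))
            best = set(ranked[:k])
            changed = changed or best != nbrs[i]
            new_nbrs.append(best)
        nbrs = new_nbrs
        if not changed:  # assumed stopping rule: no neighbor set changed
            break
    return nbrs, comparisons


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [(rng.random(), rng.random()) for _ in range(200)]
    sq_dist = lambda a, b: (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    graph, used = knn_descent(pts, 5, sq_dist)
    print(used, "comparisons versus", len(pts) ** 2, "for brute force")
```

Because each round selects the best $K$ candidates from a set that contains the current neighbors, the neighbor sets can only improve or stay the same from one approximation to the next. The sketch makes no claim about how many comparisons are needed in general.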