Walking on Minimax Paths for k-NN Search

Kye-Hyeon Kim; Seungjin Choi

doi:10.1609/aaai.v27i1.8588

Authors

Kye-Hyeon Kim Pohang University of Science and Technology
Seungjin Choi Pohang University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v27i1.8588

Keywords:

k-Nearest Neighbor Search, Link-based Dissimilarity, Minimax Distance, Minimax Path

Abstract

Link-based dissimilarity measures, such as shortest path or Euclidean commute time distance, base their distance on paths between nodes of a weighted graph. These measures are known to be better suited to data manifold with nonconvex-shaped clusters, compared to Euclidean distance, so that k-nearest neighbor (NN) search is improved in such metric spaces. In this paper we present a new link-based dissimilarity measure based on minimax paths between nodes. Two main benefits of minimax path-based dissimilarity measure are: (1) only a subset of paths is considered to make it scalable, while Euclidean commute time distance considers all possible paths; (2) it better captures nonconvex-shaped cluster structure, compared to shortest path distance. We define the total cost assigned to a path between nodes as L_p norm of intermediate costs of edges involving the path, showing that minimax path emerges from our L_p norm over paths framework. We also define minimax distance as the intermediate cost of the longest edge on the minimax path, then present a greedy algorithm to compute k smallest minimax distances between a query and N data points in O(log N + k log k) time. Numerical experiments demonstrate that our minimax k-NN algorithm reduce the search time by several orders of magnitude, compared to existing methods, while the quality of k-NN search is significantly improved over Euclidean distance.

Walking on Minimax Paths for k-NN Search

Authors

DOI:

Keywords:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information