Walking on Minimax Paths for k-NN Search

Authors

  • Kye-Hyeon Kim Pohang University of Science and Technology
  • Seungjin Choi Pohang University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v27i1.8588

Keywords:

k-Nearest Neighbor Search, Link-based Dissimilarity, Minimax Distance, Minimax Path

Abstract

Link-based dissimilarity measures, such as shortest path or Euclidean commute time distance, base their distance on paths between nodes of a weighted graph. These measures are known to be better suited to data manifold with nonconvex-shaped clusters, compared to Euclidean distance, so that k-nearest neighbor (NN) search is improved in such metric spaces. In this paper we present a new link-based dissimilarity measure based on minimax paths between nodes. Two main benefits of minimax path-based dissimilarity measure are: (1) only a subset of paths is considered to make it scalable, while Euclidean commute time distance considers all possible paths; (2) it better captures nonconvex-shaped cluster structure, compared to shortest path distance. We define the total cost assigned to a path between nodes as Lp norm of intermediate costs of edges involving the path, showing that minimax path emerges from our Lp norm over paths framework. We also define minimax distance as the intermediate cost of the longest edge on the minimax path, then present a greedy algorithm to compute k smallest minimax distances between a query and N data points in O(log N + k log k) time. Numerical experiments demonstrate that our minimax k-NN algorithm reduce the search time by several orders of magnitude, compared to existing methods, while the quality of k-NN search is significantly improved over Euclidean distance.

Downloads

Published

2013-06-30

How to Cite

Kim, K.-H., & Choi, S. (2013). Walking on Minimax Paths for k-NN Search. Proceedings of the AAAI Conference on Artificial Intelligence, 27(1), 518-525. https://doi.org/10.1609/aaai.v27i1.8588