Fast Algorithms for Top-k Approximate String Matching

Zhenglu Yang; Jianjun Yu; Masaru Kitsuregawa

doi:10.1609/aaai.v24i1.7527

Fast Algorithms for Top-k Approximate String Matching

Authors

Zhenglu Yang The University of Tokyo
Jianjun Yu Chinese Academy of Sciences
Masaru Kitsuregawa The University of Tokyo

DOI:

https://doi.org/10.1609/aaai.v24i1.7527

Keywords:

top-k, similar string search, efficiency

Abstract

Top-k approximate querying on string collections is an important data analysis tool for many applications, and it has been exhaustively studied. However, the scale of the problem has increased dramatically because of the prevalence of the Web. In this paper, we aim to explore the efficient top-k similar string matching problem. Several efficient strategies are introduced, such as length aware and adaptive q-gram selection. We present a general q-gram based framework and propose two efficient algorithms based on the strategies introduced. Our techniques are experimentally evaluated on three real data sets and show a superior performance.

Downloads

Published

2010-07-05

How to Cite

Yang, Z., Yu, J., & Kitsuregawa, M. (2010). Fast Algorithms for Top-k Approximate String Matching. Proceedings of the AAAI Conference on Artificial Intelligence, 24(1), 1467–1473. https://doi.org/10.1609/aaai.v24i1.7527

Download Citation

Issue

Vol. 24 No. 1 (2010): Twenty-Fourth AAAI Conference on Artificial Intelligence

Section

AAAI Technical Track: AI and the Web

Fast Algorithms for Top-k Approximate String Matching

Authors

DOI:

Keywords:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information