A Phrase-Based Method for Hierarchical Clustering of Web Snippets

Authors

  • Zhao Li, University of Vermont
  • Xindong Wu, University of Vermont

DOI:

https://doi.org/10.1609/aaai.v24i1.7773

Abstract

Document clustering is used in web information retrieval to organize retrieved results into groups, which lets users browse them quickly. A tree-like hierarchical structure is well suited to presenting those results to web users. We therefore introduce a new method for hierarchical clustering of web snippets that exploits a phrase-based document index. In our method, the hierarchy is built from phrases rather than from all snippets, and the snippets are then assigned to the corresponding clusters of phrases. We show that, compared with traditional hierarchical clustering, our method not only presents meaningful cluster labels but also improves clustering performance.
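
The general idea of clustering phrases first and assigning snippets afterwards can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how phrases are extracted, how the index is built, or which similarity measure is used. The sketch assumes word bigrams as phrases, a plain inverted index from phrase to snippet ids, and greedy agglomerative merging by Jaccard overlap of snippet sets. All function names and parameters (`extract_phrases`, `build_phrase_index`, `cluster_phrases`, `min_df`, `threshold`) are hypothetical.

```python
from itertools import combinations


def extract_phrases(snippet, n=2):
    """Return the word n-grams of a snippet (an assumed stand-in for 'phrases')."""
    words = snippet.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_phrase_index(snippets, min_df=2):
    """Map each phrase to the ids of the snippets that contain it."""
    index = {}
    for sid, text in enumerate(snippets):
        for phrase in extract_phrases(text):
            index.setdefault(phrase, set()).add(sid)
    # Keep only phrases shared by at least min_df snippets.
    return {p: ids for p, ids in index.items() if len(ids) >= min_df}


def jaccard(a, b):
    """Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def cluster_phrases(index, threshold=0.5):
    """Greedily merge phrases whose snippet sets overlap.

    Each node is a tuple (phrases, snippet_ids, children). The phrases act
    as the cluster label, and snippet_ids is the set of snippets assigned
    to that cluster.
    """
    nodes = [({p}, set(ids), []) for p, ids in sorted(index.items())]
    while len(nodes) > 1:
        # Pick the pair of current nodes with the most similar snippet sets.
        i, j = max(combinations(range(len(nodes)), 2),
                   key=lambda ij: jaccard(nodes[ij[0]][1], nodes[ij[1]][1]))
        if jaccard(nodes[i][1], nodes[j][1]) < threshold:
            break
        a, b = nodes[i], nodes[j]
        merged = (a[0] | b[0], a[1] | b[1], [a, b])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes
```

Because merging operates on the phrases that survive the `min_df` filter rather than on every snippet, the number of items to cluster depends on the phrase vocabulary, and each resulting node carries its phrases as a ready-made label. On four toy snippets such as "jaguar car dealer prices", "jaguar car dealer reviews", "jaguar big cat habitat", and "jaguar big cat diet", this sketch produces two top-level clusters, one per sense of "jaguar".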

Published

2010-07-05

How to Cite

Li, Z., & Wu, X. (2010). A Phrase-Based Method for Hierarchical Clustering of Web Snippets. Proceedings of the AAAI Conference on Artificial Intelligence, 24(1), 1947-1948. https://doi.org/10.1609/aaai.v24i1.7773