Large-Scale Community Detection on YouTube for Topic Discovery and Exploration

Authors

  • Ullas Gargi (Google, Inc.)
  • Wenjun Lu (University of Maryland)
  • Vahab Mirrokni (Google, Inc.)
  • Sangho Yoon (Google, Inc.)

DOI:

https://doi.org/10.1609/icwsm.v5i1.14191

Abstract

Detecting coherent, well-connected communities in large graphs provides insight into the graph structure and can serve as the basis for content discovery. Clustering is a popular technique for community detection, but global algorithms that examine the entire graph do not scale. Local algorithms are highly parallelizable but perform sub-optimally, especially in applications where we need to optimize multiple metrics. We present a multi-stage algorithm based on local clustering that is highly scalable, combining a pre-processing stage, a local clustering stage, and a post-processing stage. We apply it to the YouTube video graph to generate named clusters of videos with coherent content. We formalize coverage, coherence, and connectivity metrics and evaluate the quality of the algorithm for large YouTube graphs. Our use of local algorithms for global clustering, and its implementation and practical evaluation on such a large scale, is a first of its kind.

Published

2021-08-03

How to Cite

Gargi, U., Lu, W., Mirrokni, V., & Yoon, S. (2021). Large-Scale Community Detection on YouTube for Topic Discovery and Exploration. Proceedings of the International AAAI Conference on Web and Social Media, 5(1), 486-489. https://doi.org/10.1609/icwsm.v5i1.14191