Scalable Affinity Propagation for Massive Datasets

Authors

  • Hiroaki Shiokawa, University of Tsukuba

DOI:

https://doi.org/10.1609/aaai.v35i11.17160

Keywords:

Unsupervised & Self-Supervised Learning, Clustering

Abstract

Affinity Propagation (AP) is a fundamental algorithm for identifying clusters among data objects. Given similarities among objects, it iteratively updates messages between all pairs of data objects until convergence. Although AP yields higher clustering quality than other methods, it is computationally expensive and therefore has difficulty handling massive datasets, because the message updates incur a cost quadratic in the number of data objects. Here, we propose a novel fast algorithm, ScaleAP, which outputs the same clusters as AP in a shorter computation time. ScaleAP dynamically excludes unnecessary message updates without sacrificing clustering accuracy. Our extensive evaluations demonstrate that ScaleAP outperforms existing AP algorithms in running time by up to two orders of magnitude.
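
To make the quadratic cost concrete, the following is a minimal sketch of standard, dense AP message passing in Python with NumPy. It is not ScaleAP: the abstract does not describe how ScaleAP decides which updates to skip, so nothing here attempts that. The function name `affinity_propagation`, its parameters (`preference`, `damping`, `max_iter`, `stable_iters`), the negative squared Euclidean similarity, and the toy points are all assumptions chosen for illustration.

```python
import numpy as np

def affinity_propagation(S, preference=None, damping=0.7,
                         max_iter=300, stable_iters=15):
    """Standard dense AP on an (n, n) similarity matrix S (illustrative only)."""
    S = np.array(S, dtype=float)            # work on a copy
    n = S.shape[0]
    idx = np.arange(n)
    if preference is None:                  # assumed default: median similarity
        preference = np.median(S[~np.eye(n, dtype=bool)])
    S[idx, idx] = preference
    R = np.zeros((n, n))                    # responsibilities r(i, k)
    A = np.zeros((n, n))                    # availabilities  a(i, k)
    prev, stable = None, 0
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        top = AS.argmax(axis=1)
        first = AS[idx, top]
        AS[idx, top] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[idx, top] = S[idx, top] - second
        R = damping * R + (1 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0)
        Rp[idx, idx] = R[idx, idx]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[idx, idx].copy()
        A_new = np.minimum(A_new, 0)
        A_new[idx, idx] = diag              # self-availability is not clipped
        A = damping * A + (1 - damping) * A_new
        # Stop once a non-empty exemplar set has stayed unchanged long enough.
        exemplars = np.flatnonzero(np.diag(A + R) > 0)
        same = prev is not None and np.array_equal(exemplars, prev)
        stable = stable + 1 if same else 0
        prev = exemplars
        if exemplars.size > 0 and stable >= stable_iters:
            break
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    # Assign every object to its most similar exemplar.
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels

# Toy data (hypothetical): two well-separated groups of three 2-D points.
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]], dtype=float)
diff = points[:, None, :] - points[None, :, :]
S = -(diff ** 2).sum(axis=2)                # negative squared Euclidean distance
exemplars, labels = affinity_propagation(S)
print(exemplars, labels)
```

Every iteration touches all n × n entries of both message matrices, which is the quadratic per-iteration cost the abstract identifies as the bottleneck for massive datasets.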

Published

2021-05-18

How to Cite

Shiokawa, H. (2021). Scalable Affinity Propagation for Massive Datasets. Proceedings of the AAAI Conference on Artificial Intelligence, 35(11), 9639-9646. https://doi.org/10.1609/aaai.v35i11.17160

Section

AAAI Technical Track on Machine Learning IV