OneBatchPAM: A Fast and Frugal K-Medoids Algorithm
DOI:
https://doi.org/10.1609/aaai.v39i15.33776Abstract
This paper proposes a novel k-medoids approximation algorithm to handle large-scale datasets with reasonable computational time and memory complexity. We develop a local-search algorithm that iteratively improves the medoid selection based on the estimation of the k-medoids objective. A single batch of size m << n provides the estimation, which reduces the required memory size and the number of pairwise dissimilarities computations to O(mn), instead of O(n^2) compared to most k-medoids baselines. We obtain theoretical results highlighting that a batch of size m = O(log(n)) is sufficient to guarantee, with strong probability, the same performance as the original local-search algorithm. Multiple experiments conducted on real datasets of various sizes and dimensions show that our algorithm provides similar performances as state-of-the-art methods such as FasterPAM and BanditPAM++ with a drastically reduced running time.Downloads
Published
2025-04-11
How to Cite
de Mathelin, A., Cecchi, N. E., Deheeger, F., Mougeot, M., & Vayatis, N. (2025). OneBatchPAM: A Fast and Frugal K-Medoids Algorithm. Proceedings of the AAAI Conference on Artificial Intelligence, 39(15), 16172-16180. https://doi.org/10.1609/aaai.v39i15.33776
Issue
Section
AAAI Technical Track on Machine Learning I