A General Anchor-Based Framework for Scalable Fair Clustering

Authors

  • Shengfei Wei, National University of Defense Technology
  • Suyuan Liu, National University of Defense Technology
  • Jun Wang, National University of Defense Technology
  • Ke Liang, National University of Defense Technology
  • Miaomiao Li, Changsha University
  • Lei Luo, National University of Defense Technology

DOI:

https://doi.org/10.1609/aaai.v40i32.39894

Abstract

Fair clustering is crucial for mitigating bias in unsupervised learning, yet existing algorithms often suffer from quadratic or super-quadratic computational complexity, rendering them impractical for large-scale datasets. To bridge this gap, we introduce the Anchor-based Fair Clustering Framework (AFCF), a general, plug-and-play framework that gives arbitrary fair clustering algorithms linear-time scalability. Our approach first selects a small but representative set of anchors using a novel fair sampling strategy. Any off-the-shelf fair clustering algorithm can then be applied to this small anchor set. The core of our framework is a novel anchor graph construction module, in which we formulate an optimization problem that propagates labels from the anchors to the full dataset while preserving fairness. This is achieved through a carefully designed group-label joint constraint, which we prove ensures that the fairness of the final clustering on the entire dataset matches that of the anchor clustering. We solve this optimization problem efficiently using an ADMM-based algorithm. Extensive experiments on multiple large-scale benchmarks demonstrate that AFCF drastically accelerates state-of-the-art methods, reducing computation time by orders of magnitude while maintaining strong clustering performance and fairness guarantees.
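
The three stages described in the abstract (fair anchor sampling, clustering the anchors, propagating labels to all points) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and every function name here is hypothetical. It assumes proportional per-group sampling as a stand-in for the paper's fair sampling strategy, and plain nearest-anchor assignment as a stand-in for the ADMM-based anchor graph construction. Unlike the method in the paper, nearest-anchor assignment carries no fairness guarantee, so the sketch also computes a common balance score to check the result.

```python
import numpy as np

def sample_anchors(groups, m, rng):
    """Pick about m anchor indices, each protected group represented by its share of the data.
    Assumption: a simple stratified sampler, not the paper's sampling strategy."""
    groups = np.asarray(groups)
    picked = []
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        k = min(len(idx), max(1, round(m * len(idx) / len(groups))))
        picked.append(rng.choice(idx, size=k, replace=False))
    return np.sort(np.concatenate(picked))

def propagate_labels(X, anchor_idx, anchor_labels):
    """Give every point the label of its nearest anchor.
    Assumption: a naive substitute for the paper's constrained anchor graph."""
    # Squared distances from each point to each anchor, shape (n, m):
    # the cost grows linearly in n for a fixed number of anchors m.
    d2 = ((X[:, None, :] - X[anchor_idx][None, :, :]) ** 2).sum(axis=2)
    return np.asarray(anchor_labels)[d2.argmin(axis=1)]

def balance(labels, groups):
    """Minimum over clusters of (smallest group count / largest group count)."""
    labels, groups = np.asarray(labels), np.asarray(groups)
    all_groups = np.unique(groups)
    ratios = []
    for c in np.unique(labels):
        counts = np.array([np.sum((labels == c) & (groups == g)) for g in all_groups])
        ratios.append(counts.min() / counts.max())
    return min(ratios)

# Toy usage on synthetic data with two protected groups.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
groups = rng.integers(0, 2, size=1000)

anchor_idx = sample_anchors(groups, m=50, rng=rng)
# Placeholder for "any off-the-shelf fair clustering algorithm" run on the anchors only.
anchor_labels = (X[anchor_idx, 0] > 0).astype(int)
labels = propagate_labels(X, anchor_idx, anchor_labels)
print("balance on full data:", balance(labels, groups))
```

A balance of 1.0 means every cluster contains the protected groups in equal numbers, and values near 0 mean at least one cluster is dominated by a single group. Comparing the score on the anchors with the score on the full dataset shows how much fairness a naive propagation step loses, which is the gap the group-label joint constraint is designed to close.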

Published

2026-03-14

How to Cite

Wei, S., Liu, S., Wang, J., Liang, K., Li, M., & Luo, L. (2026). A General Anchor-Based Framework for Scalable Fair Clustering. Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), 26832–26840. https://doi.org/10.1609/aaai.v40i32.39894

Section

AAAI Technical Track on Machine Learning IX