Fair Model-based Clustering

Authors

  • Jinwon Park, Seoul National University
  • Kunwoong Kim, Seoul National University
  • Jihu Lee, Seoul National University
  • Yongdai Kim, Seoul National University

DOI

https://doi.org/10.1609/aaai.v40i29.39663

Abstract

The goal of fair clustering is to find clusters such that the proportion of each sensitive attribute (e.g., gender or race) in each cluster is similar to its proportion in the entire data. Various fair clustering algorithms have been proposed that modify standard K-means clustering to satisfy a given fairness constraint. A critical limitation of several existing fair clustering algorithms is that the number of parameters to be learned is proportional to the sample size, because the cluster assignment of each datum must be optimized simultaneously with the cluster centers; this makes the algorithms difficult to scale up. In this paper, we propose a new fair clustering algorithm based on a finite mixture model, called Fair Model-based Clustering (FMC). A main advantage of FMC is that the number of learnable parameters is independent of the sample size, so the algorithm scales up easily. In particular, mini-batch learning can be used to obtain clusters that are approximately fair. Moreover, FMC can be applied to non-metric data (e.g., categorical data) as long as the likelihood is well-defined. Theoretical and empirical justifications of the superiority of the proposed algorithm are provided.
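
The fairness notion described in the abstract (each cluster's sensitive-attribute proportions should resemble those of the whole data set) can be illustrated with a minimal Python sketch. This is not the FMC algorithm and not necessarily the fairness measure used in the paper; the function names `cluster_group_proportions` and `max_proportion_gap`, the gap statistic itself, and the toy data are all assumptions made purely for illustration.

```python
from collections import Counter


def cluster_group_proportions(labels, groups):
    """Return {cluster: {group: proportion}} for given cluster assignments.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels) != len(groups):
        raise ValueError("labels and groups must have the same length")
    counts = {}
    for k, s in zip(labels, groups):
        counts.setdefault(k, Counter())[s] += 1
    return {
        k: {s: c / sum(cnt.values()) for s, c in cnt.items()}
        for k, cnt in counts.items()
    }


def max_proportion_gap(labels, groups):
    """Largest absolute gap between a cluster's group proportion and the
    overall proportion of that group (0.0 means every cluster mirrors the data).

    This gap statistic is an assumed, simplified fairness measure.
    """
    n = len(groups)
    overall = {s: c / n for s, c in Counter(groups).items()}
    per_cluster = cluster_group_proportions(labels, groups)
    return max(
        abs(props.get(s, 0.0) - p)
        for props in per_cluster.values()
        for s, p in overall.items()
    )


# Toy data (made up): two clusters of four points, two sensitive groups.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
balanced_groups = ["a", "a", "b", "b", "a", "b", "a", "b"]
segregated_groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap_fair = max_proportion_gap(labels, balanced_groups)
gap_unfair = max_proportion_gap(labels, segregated_groups)
print(gap_fair, gap_unfair)
```

In the balanced case every cluster reproduces the overall 50/50 split, so the gap is zero; in the segregated case each cluster contains a single group, so the gap is as large as it can be for this data.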

Published

2026-03-14

How to Cite

Park, J., Kim, K., Lee, J., & Kim, Y. (2026). Fair Model-based Clustering. Proceedings of the AAAI Conference on Artificial Intelligence, 40(29), 24773–24781. https://doi.org/10.1609/aaai.v40i29.39663

Issue

Vol. 40 No. 29 (2026)

Section

AAAI Technical Track on Machine Learning VI