SoF: Soft-Cluster Matrix Factorization for Probabilistic Clustering


  • Han Zhao, University of Waterloo
  • Pascal Poupart, University of Waterloo
  • Yongfeng Zhang, Tsinghua University
  • Martin Lysy, University of Waterloo



Keywords: Nonnegative matrix factorization, Probabilistic clustering, Optimization


We propose SoF (Soft-cluster matrix Factorization), a probabilistic clustering algorithm that softly assigns each data point to clusters. Unlike model-based clustering algorithms, SoF makes no assumptions about the data density distribution. Instead, we take an axiomatic approach and define four properties that the probability of co-clustered pairs of points should satisfy. Based on these properties, SoF uses a distance measure between pairs of points to induce the conditional co-cluster probabilities. The objective function in our framework establishes an important connection between probabilistic clustering and constrained symmetric Nonnegative Matrix Factorization (NMF), and thereby provides a theoretical interpretation for NMF-based clustering algorithms. To optimize the objective, we derive a sequential minimization algorithm using a penalty method. Experimental results on both synthetic and real-world datasets show that SoF significantly outperforms previous NMF-based algorithms and that it is able to detect non-convex patterns as well as cluster boundaries.
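The general idea described above (pairwise distances inducing co-cluster probabilities, which are then approximated by a constrained symmetric NMF) can be illustrated with a minimal sketch. This is not the authors' algorithm: the Gaussian-style kernel `exp(-D / sigma)`, the fixed penalty weight, the plain projected-gradient loop, and all names (`soft_cluster_sketch`, `sigma`, `penalty`, `lr`, `iters`) are assumptions made here for illustration, whereas the paper derives its own distance-based probabilities and a sequential minimization scheme.

```python
import numpy as np


def soft_cluster_sketch(X, k, sigma=1.0, penalty=10.0, lr=0.01, iters=2000, seed=0):
    """Illustrative soft clustering via penalized symmetric NMF (hypothetical helper).

    Approximates a pairwise co-cluster matrix P by W @ W.T, where row i of W
    is read as the soft assignment of point i to k clusters.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Pairwise squared Euclidean distances between all points.
    sq = np.sum(X ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)

    # ASSUMPTION: a Gaussian-style kernel maps distances to co-cluster scores.
    P = np.exp(-D / sigma)

    # Random nonnegative start with rows summing to one.
    W = rng.random((n, k))
    W /= W.sum(axis=1, keepdims=True)

    for _ in range(iters):
        R = W @ W.T - P                            # residual of the factorization
        grad_fit = 4.0 * (R @ W)                   # gradient of ||P - W W^T||_F^2
        row_err = W.sum(axis=1, keepdims=True) - 1.0
        grad_pen = 2.0 * penalty * row_err         # gradient of the row-sum penalty
        # Gradient step (scaled by 1/n), then projection onto W >= 0.
        W = np.maximum(W - lr * (grad_fit + grad_pen) / n, 0.0)

    # Renormalize so each row is a probability distribution over clusters.
    return W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
```

Calling `soft_cluster_sketch(X, k=2)` on an `(n, d)` array returns an `(n, 2)` matrix whose rows are nonnegative and sum to one. The quadratic penalty only encourages the row-sum constraint during optimization, so the final renormalization is what enforces it exactly in this sketch.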




How to Cite

Zhao, H., Poupart, P., Zhang, Y., & Lysy, M. (2015). SoF: Soft-Cluster Matrix Factorization for Probabilistic Clustering. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1).



Main Track: Novel Machine Learning Algorithms