Formalizing Hierarchical Clustering as Integer Linear Programming

Authors

  • Sean Gilpin University of California, Davis
  • Siegried Nijssen Katholieke Universiteit Leuven
  • Ian Davidson University of California, Davis

DOI:

https://doi.org/10.1609/aaai.v27i1.8671

Keywords:

Hierarchical Clustering

Abstract

Hierarchical clustering is typically implemented as a greedy heuristic algorithm with no explicit objective function. In this work we formalize hierarchical clustering as an integer linear programming (ILP) problem with a natural objective function and the dendrogram properties enforced as linear constraints.  Though exact solvers exists for ILP we show that a simple randomized algorithm and a linear programming (LP) relaxation can be used to provide approximate solutions faster.  Formalizing hierarchical clustering also has the benefit that relaxing the constraints can produce novel problem variations such as overlapping clusterings.  Our experiments show that our formulation is capable of outperforming standard agglomerative clustering algorithms in a variety of settings, including traditional hierarchical clustering as well as learning overlapping clusterings.

Downloads

Published

2013-06-30

How to Cite

Gilpin, S., Nijssen, S., & Davidson, I. (2013). Formalizing Hierarchical Clustering as Integer Linear Programming. Proceedings of the AAAI Conference on Artificial Intelligence, 27(1), 372-378. https://doi.org/10.1609/aaai.v27i1.8671