Exact Clustering via Integer Programming and Maximum Satisfiability
Keywords:clustering, exact algorithms, mathematical programming, integer linear programming, constraint satisfaction, maximum satisfiability
We consider the following general graph clustering problem: given a complete undirected graph G=(V,E,c) with an edge weight function c:E->Q, we are asked to find a partition C of V that maximizes the sum of edge weights within the clusters in C. Owing to its high generality, this problem has a wide variety of real-world applications, including correlation clustering, group technology, and community detection. In this study, we investigate the design of mathematical programming formulations and constraint satisfaction formulations for the problem. First, we present a novel integer linear programming (ILP) formulation that has far fewer constraints than the standard ILP formulation by Groetschel and Wakabayashi (1989). Second, we propose an ILP-based exact algorithm that solves an ILP problem obtained by modifying our above ILP formulation and then performs simple post-processing to produce an optimal solution to the original problem. Third, we present maximum satisfiability (MaxSAT) counterparts of both our ILP formulation and ILP-based exact algorithm. Computational experiments using well-known real-world datasets demonstrate that our ILP-based approaches and their MaxSAT counterparts are highly effective in terms of both memory efficiency and computation time.