Near-optimal Linear Predictive Clustering in Non-separable Spaces via MIP and QPBO Reductions

Authors

  • Jiazhou Liang University of Toronto
  • Hassan Khurram University of Toronto
  • Scott Sanner University of Toronto Vector Institute for Artificial Intelligence

DOI:

https://doi.org/10.1609/aaai.v40i28.39511

Abstract

Linear Predictive Clustering (LPC) partitions samples based on shared linear relationships between feature and target variables, with numerous applications including marketing, medicine, and education. Greedy optimization methods, commonly used for LPC, alternate between clustering and linear regression but lack global optimality. While effective for separable clusters, they struggle in non-separable settings where clusters overlap in feature space. In an alternative constrained optimization paradigm, previous works formulated LPC as a Mixed-Integer Program (MIP), ensuring global optimality regardless of separability but at the cost of poor scalability. This work builds on the constrained optimization paradigm to introduce two novel approaches that improve the efficiency of global optimization for LPC. By leveraging key theoretical properties of separability, we derive near-optimal approximations with provable error bounds, significantly reducing the MIP formulation’s complexity and improving scalability. Additionally, we can further approximate LPC as a Quadratic Pseudo-Boolean Optimization (QPBO) problem, achieving additional computational gains in the special case of two clusters. Comparative analyses on synthetic and real-world datasets demonstrate that our methods consistently achieve near-optimal solutions with substantially lower regression errors than greedy optimization while exhibiting superior scalability over existing MIP formulations.

Published

2026-03-14

How to Cite

Liang, J., Khurram, H., & Sanner, S. (2026). Near-optimal Linear Predictive Clustering in Non-separable Spaces via MIP and QPBO Reductions. Proceedings of the AAAI Conference on Artificial Intelligence, 40(28), 23409–23416. https://doi.org/10.1609/aaai.v40i28.39511

Issue

Section

AAAI Technical Track on Machine Learning V