Free Lunch in the Forest: Functionally-Identical Pruning of Boosted Tree Ensembles

Authors

  • Youssouf Emine, Department of Mathematics and Industrial Engineering, Polytechnique Montréal; Canada Excellence Research Chair in Data-Science for Real-time Decision-Making (CERC)
  • Alexandre Forel, Department of Mathematics and Industrial Engineering, Polytechnique Montréal; CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains
  • Idriss Malek, Department of Mathematics and Industrial Engineering, Polytechnique Montréal; CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains
  • Thibaut Vidal, Department of Mathematics and Industrial Engineering, Polytechnique Montréal; CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains

DOI:

https://doi.org/10.1609/aaai.v39i16.33811

Abstract

Tree ensembles, including boosting methods, are highly effective and widely used for tabular data. However, large ensembles lack interpretability and require longer inference times. We introduce a method to prune a tree ensemble into a reduced version that is "functionally identical" to the original model. In other words, our method guarantees that the prediction function stays unchanged for any possible input. As a consequence, this pruning algorithm is lossless for any aggregated metric. We formalize the problem of functionally identical pruning on ensembles, introduce an exact optimization model, and provide a fast yet highly effective method to prune large ensembles. Our algorithm iteratively prunes considering a finite set of points, which is incrementally augmented using an adversarial model. In multiple computational experiments, we show that our approach provides a "free lunch", significantly reducing the ensemble size without altering the model's behavior. Thus, we can preserve state-of-the-art performance at a fraction of the original model's size.
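
The iterative scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a tiny binary input domain, so that both the pruning step and the search for a disagreeing input can be done by brute-force enumeration, and it keeps the original tree weights fixed. All helper names (`stump`, `predict`, `smallest_consistent_subset`, `find_counterexample`, `prune`) are hypothetical. The paper instead relies on an exact optimization model and an adversarial model for these two steps.

```python
from itertools import combinations, product


def stump(j):
    """Hypothetical one-split tree: votes +1 if feature j is set, else -1."""
    return lambda x: 1 if x[j] else -1


def predict(trees, weights, active, x):
    """Class predicted by the weighted vote of the trees indexed by `active`."""
    score = sum(weights[i] * trees[i](x) for i in active)
    return 1 if score >= 0 else -1


def smallest_consistent_subset(trees, weights, points):
    """Brute force: smallest subset that matches the full ensemble on `points`."""
    full = range(len(trees))
    targets = [predict(trees, weights, full, x) for x in points]
    for size in range(1, len(trees) + 1):
        for subset in combinations(full, size):
            if all(predict(trees, weights, subset, x) == t
                   for x, t in zip(points, targets)):
                return subset
    return tuple(full)


def find_counterexample(trees, weights, subset, domain):
    """Stand-in for the adversarial step: enumerate the (finite) domain and
    return an input where the pruned and full ensembles disagree, if any."""
    full = range(len(trees))
    for x in domain:
        if predict(trees, weights, subset, x) != predict(trees, weights, full, x):
            return x
    return None


def prune(trees, weights, domain):
    """Alternate pruning on a finite point set and augmenting that set."""
    points = []
    while True:
        subset = smallest_consistent_subset(trees, weights, points)
        x = find_counterexample(trees, weights, subset, domain)
        if x is None:          # no disagreement anywhere: functionally identical
            return subset
        points.append(x)       # otherwise add the disagreeing input and repeat


# Toy ensemble over three binary features (illustrative values only).
trees = [stump(1), stump(0), stump(0), stump(2)]
weights = [0.5, 1.0, 1.0, 0.5]
domain = list(product([0, 1], repeat=3))

pruned = prune(trees, weights, domain)
print(pruned)
```

In this toy ensemble the two stumps on feature 0 carry a combined weight of 2.0, which always outvotes the other two, so a single stump on feature 0 reproduces every prediction. The loop stops once the enumeration finds no disagreeing input. Every point added is new, because the pruned subset already agrees with the full ensemble on all stored points, so the loop terminates on any finite domain.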

Published

2025-04-11

How to Cite

Emine, Y., Forel, A., Malek, I., & Vidal, T. (2025). Free Lunch in the Forest: Functionally-Identical Pruning of Boosted Tree Ensembles. Proceedings of the AAAI Conference on Artificial Intelligence, 39(16), 16488–16495. https://doi.org/10.1609/aaai.v39i16.33811

Section

AAAI Technical Track on Machine Learning II