MEPSI: An MDL-Based Ensemble Pruning Approach with Structural Information

Authors

  • Xiao-Dong Bi Nanjing University
  • Shao-Qun Zhang Nanjing University
  • Yuan Jiang Nanjing University

DOI:

https://doi.org/10.1609/aaai.v38i10.28984

Keywords:

ML: Ensemble Methods, ML: Information Theory, ML: Learning Theory

Abstract

Ensemble pruning that combines a subset of individual learners generated in parallel to make predictions is an important topic in ensemble learning. Past decades have developed a lot of pruning algorithms that focus on the external behavior of learners on samples, which may lead to over-fitting. In this paper, we conjecture that the generalization performance of an ensemble is not only related to its external behavior on samples but also dependent on the internal structure of individual learners. We propose the general MEPSI approach based on Kolmogorov complexity and the Minimum Description Length (MDL) principle, which formulates the ensemble pruning task as the two-objective optimization problem that comprises the empirical error and structural information among individual learners. We also provide a concrete implementation of MEPSI on decision trees. The theoretical results provide generalization bounds for both the general MEPSI approach and tree-based implementation. The comparative experiments conducted on multiple real-world data sets demonstrate the effectiveness of our proposed method.

Published

2024-03-24

How to Cite

Bi, X.-D., Zhang, S.-Q., & Jiang, Y. (2024). MEPSI: An MDL-Based Ensemble Pruning Approach with Structural Information. Proceedings of the AAAI Conference on Artificial Intelligence, 38(10), 11078-11086. https://doi.org/10.1609/aaai.v38i10.28984

Issue

Section

AAAI Technical Track on Machine Learning I