Bagging by Design (on the Suboptimality of Bagging)

Authors

  • Periklis Papakonstantinou, Tsinghua University
  • Jia Xu, Tsinghua University
  • Zhu Cao, Tsinghua University

DOI:

https://doi.org/10.1609/aaai.v28i1.9001

Keywords:

bagging, bootstrapping, aggregation, combinatorial design

Abstract

Bagging (Breiman 1996) and its variants are among the most popular methods for aggregating classifiers and regressors. Its original analysis assumed that the bootstraps are built from an unlimited, independent source of samples; we therefore call this form of bagging ideal-bagging. In the real world, however, base predictors are trained on data subsampled from a limited number of training samples, and thus they behave very differently. We analyze the effect of intersections between the bootstraps, obtained by subsampling, that are used to train different base predictors. Most importantly, we provide an alternative subsampling method called design-bagging, based on a new construction of combinatorial designs, and prove it universally better than bagging. Methodologically, we succeed at this level of generality because we compare the prediction accuracy of bagging and design-bagging relative to the accuracy of ideal-bagging. This finds potential applications in more involved bagging-based methods. Our analytical results are backed up by experiments in classification and regression settings.
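
To make the notion of intersections between bootstraps concrete, the following is a minimal sketch, not the paper's construction. It assumes each bag is drawn uniformly without replacement from a finite pool of training indices, and it measures how many indices each pair of bags shares. The function names and parameters (`subsample_bags`, `pairwise_intersections`, `summarize`, `n_samples`, `bag_size`, `n_bags`) are hypothetical and chosen only for illustration.

```python
import itertools
import random


def subsample_bags(n_samples, bag_size, n_bags, seed=0):
    """Draw n_bags index sets, each of size bag_size, from range(n_samples).

    Assumption: sampling is uniform and without replacement within a bag.
    """
    rng = random.Random(seed)
    return [
        frozenset(rng.sample(range(n_samples), bag_size))
        for _ in range(n_bags)
    ]


def pairwise_intersections(bags):
    """Return the number of shared indices for every unordered pair of bags."""
    return [len(a & b) for a, b in itertools.combinations(bags, 2)]


def summarize(bags):
    """Report the mean and maximum pairwise intersection size."""
    sizes = pairwise_intersections(bags)
    return {"mean": sum(sizes) / len(sizes), "max": max(sizes)}


if __name__ == "__main__":
    # With a limited pool, independently drawn bags overlap.
    bags = subsample_bags(n_samples=1000, bag_size=100, n_bags=20, seed=1)
    print(summarize(bags))
```

Under this sampling assumption, the expected intersection of two bags is `bag_size ** 2 / n_samples`, so overlap grows quickly as bags become large relative to the pool. With an unlimited source of samples, as in ideal-bagging, no such overlap would arise; a designed subsampling scheme aims to control it rather than leave it to chance.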

Published

2014-06-21

How to Cite

Papakonstantinou, P., Xu, J., & Cao, Z. (2014). Bagging by Design (on the Suboptimality of Bagging). Proceedings of the AAAI Conference on Artificial Intelligence, 28(1). https://doi.org/10.1609/aaai.v28i1.9001

Section

Main Track: Novel Machine Learning Algorithms