Bagging by Design (on the Suboptimality of Bagging)

Periklis Papakonstantinou; Jia Xu; Zhu Cao

doi:10.1609/aaai.v28i1.9001

Authors

Periklis Papakonstantinou, Tsinghua University
Jia Xu, Tsinghua University
Zhu Cao, Tsinghua University

DOI:

https://doi.org/10.1609/aaai.v28i1.9001

Keywords:

bagging, bootstrapping, aggregation, combinatorial design

Abstract

Bagging (Breiman 1996) and its variants are among the most popular methods for aggregating classifiers and regressors. Its original analysis assumed that the bootstraps are built from an unlimited, independent source of samples; we call this form of bagging ideal-bagging. In practice, however, base predictors are trained on data subsampled from a limited number of training samples, and thus they behave very differently. We analyze the effect of intersections between the bootstraps, obtained by subsampling, that are used to train different base predictors. Most importantly, we provide an alternative subsampling method called design-bagging, based on a new construction of combinatorial designs, and prove it universally better than bagging. Methodologically, we succeed at this level of generality because we compare the prediction accuracy of bagging and design-bagging relative to the accuracy of ideal-bagging. This approach has potential applications in more involved bagging-based methods. Our analytical results are backed up by experiments in classification and regression settings.
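
As a rough illustration of the intersections the abstract refers to, the following sketch draws several subsamples from a small, finite training set and measures how many training indices each pair shares. It is not the design-bagging construction, which the abstract does not spell out. The function names, the set sizes, and the choice to sample without replacement are all assumptions made for this example.

```python
import random
from itertools import combinations


def draw_subsamples(n_train, n_bags, bag_size, seed=0):
    """Draw n_bags index sets of size bag_size from range(n_train).

    Assumption: each bag is sampled uniformly without replacement,
    which is one plausible reading of "subsampling".
    """
    rng = random.Random(seed)
    return [set(rng.sample(range(n_train), bag_size)) for _ in range(n_bags)]


def pairwise_intersections(bags):
    """Return the intersection size for every unordered pair of bags."""
    return [len(a & b) for a, b in combinations(bags, 2)]


# Hypothetical sizes chosen only for illustration.
bags = draw_subsamples(n_train=100, n_bags=10, bag_size=30)
sizes = pairwise_intersections(bags)
mean_overlap = sum(sizes) / len(sizes)

print(f"pairs compared: {len(sizes)}")
print(f"mean pairwise intersection: {mean_overlap:.2f}")
print(f"largest pairwise intersection: {max(sizes)}")
```

With independent uniform subsamples of size b from n points, two bags share b*b/n indices on average (9 for the sizes above), so base predictors trained on them see partly the same data. With an unlimited, independent source of samples, as in ideal-bagging, no such sharing would arise.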
