MAPTree: Beating “Optimal” Decision Trees with Bayesian Decision Trees

Authors

  • Colin Sullivan Stanford University
  • Mo Tiwari Stanford University
  • Sebastian Thrun Stanford University

DOI:

https://doi.org/10.1609/aaai.v38i8.28751

Keywords:

DMKM: Rule Mining & Pattern Mining, ML: Bayesian Learning, ML: Classification and Regression, RU: Probabilistic Inference, SO: Applications, SO: Heuristic Search

Abstract

Decision trees remain one of the most popular machine learning models today, largely due to their out-of-the-box performance and interpretability. In this work, we present a Bayesian approach to decision tree induction via maximum a posteriori inference of a posterior distribution over trees. We first demonstrate a connection between maximum a posteriori inference of decision trees and AND/OR search. Using this connection, we propose an AND/OR search algorithm, dubbed MAPTree, which is able to recover the maximum a posteriori tree. Lastly, we demonstrate the empirical performance of the maximum a posteriori tree both on synthetic data and in real world settings. On 16 real world datasets, MAPTree either outperforms baselines or demonstrates comparable performance but with much smaller trees. On a synthetic dataset, MAPTree also demonstrates greater robustness to noise and better generalization than existing approaches. Finally, MAPTree recovers the maxiumum a posteriori tree faster than existing sampling approaches and, in contrast with those algorithms, is able to provide a certificate of optimality. The code for our experiments is available at https://github.com/ThrunGroup/maptree.

Published

2024-03-24

How to Cite

Sullivan, C., Tiwari, M., & Thrun, S. (2024). MAPTree: Beating “Optimal” Decision Trees with Bayesian Decision Trees. Proceedings of the AAAI Conference on Artificial Intelligence, 38(8), 9019-9026. https://doi.org/10.1609/aaai.v38i8.28751

Issue

Section

AAAI Technical Track on Data Mining & Knowledge Management