State Aggregation in Monte Carlo Tree Search

Authors

  • Jesse Hostetler, Oregon State University
  • Alan Fern, Oregon State University
  • Tom Dietterich, Oregon State University

DOI:

https://doi.org/10.1609/aaai.v28i1.9066

Keywords:

Monte Carlo tree search, State abstraction, Markov decision processes

Abstract

Monte Carlo tree search (MCTS) algorithms are a popular approach to online decision-making in Markov decision processes (MDPs). These algorithms can, however, perform poorly in MDPs with high stochastic branching factors. In this paper, we study state aggregation as a way of reducing stochastic branching in tree search. Prior work has studied formal properties of MDP state aggregation in the context of dynamic programming and reinforcement learning, but little attention has been paid to state aggregation in MCTS. Our main result is a performance loss bound for a class of value function-based state aggregation criteria in expectimax search trees. We also consider how to construct MCTS algorithms that operate in the abstract state space but require a simulator of the ground dynamics only. We find that trajectory sampling algorithms like UCT can be adapted easily, but that sparse sampling algorithms present difficulties. As a proof of concept, we experimentally confirm that state aggregation can improve the finite-sample performance of UCT.
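To make the abstract's idea concrete, the sketch below shows one way trajectory-sampling MCTS can be run in an abstract state space while using only a ground simulator: sampled ground states are mapped through an aggregation function and the search statistics are pooled per abstract state, reducing the stochastic branching the tree sees. This is a minimal illustrative sketch on a toy chain MDP, not the paper's algorithm; the names (`phi`, `uct_abstract`) and the specific aggregation are assumptions for illustration.

```python
import math
import random

# Toy ground MDP: a state is (position, noise_bit) on a chain 0..5.
# Only the position affects reward, so the hypothetical abstraction phi
# aggregates ground states that differ only in the irrelevant noise bit.
def phi(state):
    pos, _noise = state
    return pos  # illustrative aggregation: drop the noise component

def simulate(state, action):
    """Ground simulator: action in {-1, +1} moves the position; the
    noise bit is resampled, creating stochastic branching."""
    pos, _ = state
    pos = max(0, min(5, pos + action))
    reward = 1.0 if pos == 5 else 0.0
    return (pos, random.randint(0, 1)), reward

def uct_abstract(root, n_iters=2000, depth=4, c=1.4):
    """UCT whose visit counts and value estimates are keyed by
    (abstract state, depth, action), so ground states with the same
    abstraction share statistics. Only the ground simulator is used."""
    N, Q = {}, {}
    actions = [-1, 1]

    def key(s, d, a):
        return (phi(s), d, a)

    def rollout(s, d):
        # Uniform-random rollout from a newly expanded node.
        total = 0.0
        for _ in range(d):
            s, r = simulate(s, random.choice(actions))
            total += r
        return total

    def search(s, d):
        if d == 0:
            return 0.0
        n_s = sum(N.get(key(s, d, a), 0) for a in actions)

        def ucb(a):
            n = N.get(key(s, d, a), 0)
            if n == 0:
                return float('inf')  # try each action once first
            return Q[key(s, d, a)] + c * math.sqrt(math.log(n_s) / n)

        a = max(actions, key=ucb)
        s2, r = simulate(s, a)
        first_visit = N.get(key(s, d, a), 0) == 0
        ret = r + (rollout(s2, d - 1) if first_visit else search(s2, d - 1))
        k = key(s, d, a)
        N[k] = N.get(k, 0) + 1
        Q[k] = Q.get(k, 0.0) + (ret - Q.get(k, 0.0)) / N[k]
        return ret

    for _ in range(n_iters):
        search(root, depth)
    return max(actions, key=lambda a: Q.get(key(root, depth, a), float('-inf')))
```

Because `phi` merges the two noise variants of each position, samples from both ground successors update a single abstract node, which is the pooling effect the paper's abstraction criteria are meant to license.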

Published

2014-06-21

How to Cite

Hostetler, J., Fern, A., & Dietterich, T. (2014). State Aggregation in Monte Carlo Tree Search. Proceedings of the AAAI Conference on Artificial Intelligence, 28(1). https://doi.org/10.1609/aaai.v28i1.9066

Section

AAAI Technical Track: Reasoning under Uncertainty