State Aggregation in Monte Carlo Tree Search

Authors

  • Jesse Hostetler, Oregon State University
  • Alan Fern, Oregon State University
  • Tom Dietterich, Oregon State University

DOI:

https://doi.org/10.1609/aaai.v28i1.9066

Keywords:

Monte Carlo tree search, State abstraction, Markov decision processes

Abstract

Monte Carlo tree search (MCTS) algorithms are a popular approach to online decision-making in Markov decision processes (MDPs). These algorithms can, however, perform poorly in MDPs with high stochastic branching factors. In this paper, we study state aggregation as a way of reducing stochastic branching in tree search. Prior work has studied formal properties of MDP state aggregation in the context of dynamic programming and reinforcement learning, but little attention has been paid to state aggregation in MCTS. Our main result is a performance loss bound for a class of value function-based state aggregation criteria in expectimax search trees. We also consider how to construct MCTS algorithms that operate in the abstract state space but require a simulator of the ground dynamics only. We find that trajectory sampling algorithms like UCT can be adapted easily, but that sparse sampling algorithms present difficulties. As a proof of concept, we experimentally confirm that state aggregation can improve the finite-sample performance of UCT.
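To make the abstract's idea concrete, the sketch below shows one way trajectory-sampling MCTS can be run in an abstract state space while using only a ground simulator: sampled ground states are mapped through an aggregation function and the search statistics are pooled per abstract state, reducing the stochastic branching the tree sees. This is a minimal illustrative sketch on a toy chain MDP, not the paper's algorithm; the names (`phi`, `uct_abstract`) and the specific aggregation are assumptions for illustration.

```python
import math
import random

# Toy ground MDP: a state is (position, noise_bit) on a chain 0..5.
# Only the position affects reward, so the hypothetical abstraction phi
# aggregates ground states that differ only in the irrelevant noise bit.
def phi(state):
    pos, _noise = state
    return pos  # illustrative aggregation: drop the noise component

def simulate(state, action):
    """Ground simulator: action in {-1, +1} moves the position; the
    noise bit is resampled, creating stochastic branching."""
    pos, _ = state
    pos = max(0, min(5, pos + action))
    reward = 1.0 if pos == 5 else 0.0
    return (pos, random.randint(0, 1)), reward

def uct_abstract(root, n_iters=2000, depth=4, c=1.4):
    """UCT whose visit counts and value estimates are keyed by
    (abstract state, depth, action), so ground states with the same
    abstraction share statistics. Only the ground simulator is used."""
    N, Q = {}, {}
    actions = [-1, 1]

    def key(s, d, a):
        return (phi(s), d, a)

    def rollout(s, d):
        # Uniform-random rollout from a newly expanded node.
        total = 0.0
        for _ in range(d):
            s, r = simulate(s, random.choice(actions))
            total += r
        return total

    def search(s, d):
        if d == 0:
            return 0.0
        n_s = sum(N.get(key(s, d, a), 0) for a in actions)

        def ucb(a):
            n = N.get(key(s, d, a), 0)
            if n == 0:
                return float('inf')  # try each action once first
            return Q[key(s, d, a)] + c * math.sqrt(math.log(n_s) / n)

        a = max(actions, key=ucb)
        s2, r = simulate(s, a)
        first_visit = N.get(key(s, d, a), 0) == 0
        ret = r + (rollout(s2, d - 1) if first_visit else search(s2, d - 1))
        k = key(s, d, a)
        N[k] = N.get(k, 0) + 1
        Q[k] = Q.get(k, 0.0) + (ret - Q.get(k, 0.0)) / N[k]
        return ret

    for _ in range(n_iters):
        search(root, depth)
    return max(actions, key=lambda a: Q.get(key(root, depth, a), float('-inf')))
```

Because `phi` merges the two noise variants of each position, samples from both ground successors update a single abstract node, which is the pooling effect the paper's abstraction criteria are meant to license.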

Published

2014-06-21

How to Cite

Hostetler, J., Fern, A., & Dietterich, T. (2014). State Aggregation in Monte Carlo Tree Search. Proceedings of the AAAI Conference on Artificial Intelligence, 28(1). https://doi.org/10.1609/aaai.v28i1.9066

Section

AAAI Technical Track: Reasoning under Uncertainty