Improving Exploration in UCT Using Local Manifolds

Sriram Srinivasan; Erik Talvitie; Michael Bowling

doi:10.1609/aaai.v29i1.9660

Improving Exploration in UCT Using Local Manifolds

Authors

Sriram Srinivasan University of Alberta
Erik Talvitie Franklin and Marshal College
Michael Bowling University of Alberta

DOI:

https://doi.org/10.1609/aaai.v29i1.9660

Keywords:

UCT, Exploration, Manifolds

Abstract

Monte-Carlo planning has been proven successful in manysequential decision-making settings, but it suffers from poorexploration when the rewards are sparse. In this paper, weimprove exploration in UCT by generalizing across similarstates using a given distance metric. We show that this algorithm,like UCT, converges asymptotically to the optimalaction. When the state space does not have a natural distancemetric, we show how we can learn a local manifold from thetransition graph of states in the near future. to obtain a distancemetric. On domains inspired by video games, empiricalevidence shows that our algorithm is more sample efficientthan UCT, particularly when rewards are sparse.

Downloads

Published

2015-03-04

How to Cite

Srinivasan, S., Talvitie, E., & Bowling, M. (2015). Improving Exploration in UCT Using Local Manifolds. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9660

Download Citation

Issue

Vol. 29 No. 1 (2015): Twenty-Ninth AAAI Conference on Artificial Intelligence

Section

Main Track: Planning and Scheduling

Improving Exploration in UCT Using Local Manifolds

Authors

DOI:

Keywords:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information