Dynamic Automaton-Guided Reward Shaping for Monte Carlo Tree Search

Alvaro Velasquez; Brett Bissey; Lior Barak; Andre Beckus; Ismail Alkhouri; Daniel Melcer; George Atia

doi:10.1609/aaai.v35i13.17427

Authors

Alvaro Velasquez, Air Force Research Laboratory
Brett Bissey, University of Central Florida
Lior Barak, University of Central Florida
Andre Beckus, Air Force Research Laboratory
Ismail Alkhouri, University of Central Florida
Daniel Melcer, Northeastern University
George Atia, University of Central Florida

DOI:

https://doi.org/10.1609/aaai.v35i13.17427

Keywords:

Planning with Markov Models (MDPs, POMDPs), Sequential Decision Making, Reinforcement Learning, Neuro-Symbolic AI (NSAI)

Abstract

Reinforcement learning and planning have been revolutionized in recent years, due in part to the mass adoption of deep convolutional neural networks and the resurgence of powerful methods to refine decision-making policies. However, the problem of sparse reward signals and their representation remains pervasive in many domains. While various reward-shaping mechanisms and imitation learning approaches have been proposed to mitigate this problem, the use of human-aided artificial rewards introduces human error, sub-optimal behavior, and a greater propensity for reward hacking. In this paper, we mitigate this by representing objectives as automata in order to define novel reward shaping functions over this structured representation. In doing so, we address the sparse rewards problem within a novel implementation of Monte Carlo Tree Search (MCTS) by proposing a reward shaping function which is updated dynamically to capture statistics on the utility of each automaton transition as it pertains to satisfying the goal of the agent. We further demonstrate that such automaton-guided reward shaping can be utilized to facilitate transfer learning between different environments when the objective is the same.
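
The abstract does not state the update rule itself, so the following is only a minimal sketch of the general idea of keeping per-transition statistics over an automaton and turning them into a shaped reward. The class name `TransitionStats`, the win/visit ratio, and the state labels `q0`, `q1`, `q2` are illustrative assumptions, not the authors' formulation.

```python
from collections import defaultdict


class TransitionStats:
    """Hypothetical running win/visit counts per automaton transition (q, q_next)."""

    def __init__(self):
        self.visits = defaultdict(int)
        self.wins = defaultdict(int)

    def update(self, automaton_trace, satisfied):
        # Credit each distinct transition observed in one episode's automaton trace.
        for edge in set(zip(automaton_trace, automaton_trace[1:])):
            self.visits[edge] += 1
            if satisfied:
                self.wins[edge] += 1

    def shaped_reward(self, q, q_next):
        # Assumed shaping signal: empirical success rate of the transition,
        # or 0.0 if the transition has never been observed.
        n = self.visits.get((q, q_next), 0)
        return self.wins.get((q, q_next), 0) / n if n else 0.0


stats = TransitionStats()
stats.update(["q0", "q1", "q2"], satisfied=True)   # episode that met the objective
stats.update(["q0", "q1", "q1"], satisfied=False)  # episode that did not
print(stats.shaped_reward("q0", "q1"))  # 0.5
print(stats.shaped_reward("q1", "q2"))  # 1.0
```

Because the statistics are keyed on automaton transitions rather than environment states, a table like this could in principle be reused in a different environment that shares the same objective, which is the transfer setting the abstract mentions.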
