Symbolic Task Inference in Deep Reinforcement Learning (Abstract Reprint)

Hosein Hasanbeig; Natasha Yogananda Jeppu; Alessandro Abate; Tom Melham; Daniel Kroening

doi:10.1609/aaai.v40i47.41382

Authors

Hosein Hasanbeig, Microsoft Research
Natasha Yogananda Jeppu, Department of Computer Science, University of Oxford
Alessandro Abate, Department of Computer Science, University of Oxford
Tom Melham, Department of Computer Science, University of Oxford
Daniel Kroening, Amazon


Abstract

This paper proposes DeepSynth, a method for effective training of deep reinforcement learning agents when the reward is sparse or non-Markovian, but at the same time progress towards the reward requires achieving an unknown sequence of high-level objectives. Our method employs a novel algorithm for synthesis of compact finite state automata to uncover this sequential structure automatically. We synthesise a human-interpretable automaton from trace data collected by exploring the environment. The state space of the environment is then enriched with the synthesised automaton, so that the generation of a control policy by deep reinforcement learning is guided by the discovered structure encoded in the automaton. The proposed approach is able to cope with both high-dimensional, low-level features and unknown sparse or non-Markovian rewards. We have evaluated DeepSynth’s performance in a set of experiments that includes the Atari game Montezuma’s Revenge, known to be challenging. Compared to approaches that rely solely on deep reinforcement learning, we obtain a reduction of two orders of magnitude in the iterations required for policy synthesis, and a significant improvement in scalability.
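To make the idea of enriching the environment state with an automaton more concrete, the following is a minimal sketch in Python. It is not the DeepSynth implementation: the automaton here is written by hand rather than synthesised from traces, and every name in it (`Automaton`, `ProductEnv`, `CorridorEnv`, `labeller`) is hypothetical and introduced only for illustration. The toy environment has a reward that depends on history (the agent must visit a key cell before a door cell), and the wrapper pairs each observation with the automaton state so that a learner sees a state that records this progress.

```python
from dataclasses import dataclass, field


@dataclass
class Automaton:
    """Deterministic finite automaton over high-level labels (illustrative only)."""
    initial: int
    transitions: dict = field(default_factory=dict)  # (state, label) -> next state

    def step(self, state, label):
        # Assumption: a label with no outgoing edge leaves the state unchanged.
        return self.transitions.get((state, label), state)


class CorridorEnv:
    """Toy corridor with cells 0..4: key at cell 0, door at cell 4, start at cell 2.

    The observation is only the position, so the reward is non-Markovian
    with respect to the observation: it depends on whether the key was
    picked up earlier.
    """

    def reset(self):
        self.pos, self.has_key = 2, False
        return self.pos

    def step(self, action):
        # action is -1 (left) or +1 (right); positions are clamped to 0..4.
        self.pos = max(0, min(4, self.pos + action))
        if self.pos == 0:
            self.has_key = True
        done = self.pos == 4 and self.has_key
        return self.pos, (1.0 if done else 0.0), done


class ProductEnv:
    """Pairs each low-level observation with the current automaton state."""

    def __init__(self, env, automaton, labeller):
        self.env = env
        self.automaton = automaton
        self.labeller = labeller  # maps an observation to a high-level label
        self.q = automaton.initial

    def reset(self):
        obs = self.env.reset()
        self.q = self.automaton.initial
        return (obs, self.q)

    def step(self, action):
        obs, reward, done = self.env.step(action)
        self.q = self.automaton.step(self.q, self.labeller(obs))
        return (obs, self.q), reward, done


def labeller(obs):
    # Hypothetical labelling function: names the object at the current cell.
    return {0: "key", 4: "door"}.get(obs, "empty")


# Hand-written automaton encoding "key, then door".
automaton = Automaton(initial=0, transitions={(0, "key"): 1, (1, "door"): 2})
env = ProductEnv(CorridorEnv(), automaton, labeller)
```

In this sketch, a policy trained on the pairs `(obs, q)` can distinguish "at cell 3 without the key" from "at cell 3 with the key", which the raw position alone cannot express. How the automaton is obtained from exploration traces is the contribution of the paper and is not reproduced here.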
