Meta Reinforcement Learning for Heuristic Planning

Ricardo Luna Gutierrez; Matteo Leonetti

doi:10.1609/icaps.v31i1.16003

Authors

Ricardo Luna Gutierrez, University of Leeds
Matteo Leonetti, University of Leeds

DOI:

https://doi.org/10.1609/icaps.v31i1.16003

Keywords:

Learning Effective Heuristics And Other Forms Of Control Knowledge

Abstract

Heuristic planning has a central role in classical planning applications and competitions. Thanks to this success, there has been increasing interest in using deep learning to create high-quality heuristics in a supervised fashion, learning from optimal solutions of previously solved planning problems. Meta-reinforcement learning is a fast-growing research area concerned with learning, from many tasks, behaviours that can quickly generalize to new tasks drawn from the same distribution as the training tasks. We make a connection between meta-reinforcement learning and heuristic planning, showing that heuristic functions meta-learned from planning problems in a given domain can outperform both popular domain-independent heuristics and heuristics learned by supervised learning. Furthermore, while most supervised learning algorithms rely on ad-hoc encodings of the state representation, our method takes a general PDDL 3.1 description as input. We evaluated our heuristic with an A* planner on six domains from the International Planning Competition and the FF Domain Collection, showing that the meta-learned heuristic leads, on average, to the expansion of fewer states than three popular heuristics used by the Fast Downward planner and a heuristic learned by supervised learning.
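The evaluation metric named in the abstract, the number of states expanded by A*, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy grid domain, the function names (`astar`, `grid_neighbors`, `manhattan`, `blind`), and the tie-breaking rule are hypothetical and are not the authors' planner, their learned heuristic, or any part of Fast Downward. The point is only that the heuristic is a pluggable function, hand-written or learned, and that a more informative one reduces the expansion count.

```python
import heapq
from itertools import count

def astar(start, goal, neighbors, heuristic):
    """Run A* and return (plan_cost, expanded_states).

    `heuristic` is any callable mapping a state to a cost-to-go estimate,
    so a hand-written or a learned function can be plugged in unchanged.
    """
    tie = count()  # unique counter so the heap never compares states
    h0 = heuristic(start)
    # Entries are (f, h, tie, g, state); equal f-values prefer lower h.
    frontier = [(h0, h0, next(tie), 0, start)]
    best_g = {start: 0}
    expanded = 0
    while frontier:
        _, _, _, g, state = heapq.heappop(frontier)
        if g > best_g.get(state, float("inf")):
            continue  # stale entry: a cheaper path to this state was found
        if state == goal:
            return g, expanded
        expanded += 1
        for nxt, cost in neighbors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                h = heuristic(nxt)
                heapq.heappush(frontier, (new_g + h, h, next(tie), new_g, nxt))
    return None, expanded

# Toy domain (an assumption): an empty 10x10 grid with unit-cost moves.
SIZE = 10
GOAL = (SIZE - 1, SIZE - 1)

def grid_neighbors(state):
    x, y = state
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < SIZE and 0 <= ny < SIZE:
            yield (nx, ny), 1

def blind(state):
    # Uninformed baseline: always estimates zero remaining cost.
    return 0

def manhattan(state):
    # Admissible, consistent estimate for unit-cost grid moves.
    return abs(GOAL[0] - state[0]) + abs(GOAL[1] - state[1])

blind_cost, blind_expanded = astar((0, 0), GOAL, grid_neighbors, blind)
informed_cost, informed_expanded = astar((0, 0), GOAL, grid_neighbors, manhattan)
print("blind:   ", blind_cost, blind_expanded)
print("informed:", informed_cost, informed_expanded)
```

Both runs return a plan of the same cost, while the informed heuristic expands fewer states. Comparing heuristics by expansion count rather than wall-clock time keeps the comparison independent of how expensive each heuristic is to evaluate, which matters when one of them is a neural network.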
