Meta Reinforcement Learning for Heuristic Planning


  • Ricardo Luna Gutierrez, University of Leeds
  • Matteo Leonetti, University of Leeds


Learning Effective Heuristics And Other Forms Of Control Knowledge


Heuristic planning plays a central role in classical planning applications and competitions. Thanks to this success, there has been increasing interest in using Deep Learning to create high-quality heuristics in a supervised fashion, learning from optimal solutions of previously solved planning problems. Meta-reinforcement learning is a fast-growing research area concerned with learning, from many tasks, behaviours that can quickly generalize to new tasks drawn from the same distribution as the training tasks. We make a connection between meta-reinforcement learning and heuristic planning, showing that heuristic functions meta-learned from planning problems in a given domain can outperform both popular domain-independent heuristics and heuristics learned by supervised learning. Furthermore, while most supervised learning algorithms rely on ad-hoc encodings of the state representation, our method takes a general PDDL 3.1 description as input. We evaluated our heuristic with an A* planner on six domains from the International Planning Competition and the FF Domain Collection, showing that the meta-learned heuristic leads, on average, to the expansion of fewer states than three popular heuristics used by the Fast Downward planner and than a heuristic learned by supervised learning.
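The evaluation metric mentioned above, the number of states A* expands under a given heuristic, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 5x5 grid domain, the function names (`astar`, `successors`, `blind`, `manhattan`), and the tie-breaking rule are hypothetical, and the hand-written `manhattan` function merely stands in for a heuristic that would, in the paper's setting, be meta-learned from PDDL problems. It is not the authors' planner or method.

```python
import heapq
from itertools import count


def astar(start, is_goal, successors, heuristic):
    """Run A* and return (plan_cost, expanded_states).

    plan_cost is None when no plan exists. Ties on f = g + h are broken
    in favour of the smaller h value (an illustrative choice).
    """
    tie = count()  # unique counter so the heap never compares states
    h0 = heuristic(start)
    frontier = [(h0, h0, next(tie), 0, start)]
    best_g = {start: 0}
    expanded = 0
    while frontier:
        _, _, _, g, state = heapq.heappop(frontier)
        if g > best_g.get(state, float("inf")):
            continue  # stale queue entry, a cheaper path was found later
        if is_goal(state):
            return g, expanded
        expanded += 1
        for nxt, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                h = heuristic(nxt)
                heapq.heappush(frontier, (new_g + h, h, next(tie), new_g, nxt))
    return None, expanded


# Hypothetical toy domain: an empty 5x5 grid with unit-cost moves.
SIZE = 5
START = (0, 0)
GOAL = (SIZE - 1, SIZE - 1)


def successors(state):
    x, y = state
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < SIZE and 0 <= ny < SIZE:
            yield (nx, ny), 1


def blind(state):
    """Uninformed baseline: A* degenerates to uniform-cost search."""
    return 0


def manhattan(state):
    """Stand-in for a learned heuristic (hand-written, not learned)."""
    return abs(GOAL[0] - state[0]) + abs(GOAL[1] - state[1])


def at_goal(state):
    return state == GOAL


blind_cost, blind_expanded = astar(START, at_goal, successors, blind)
informed_cost, informed_expanded = astar(START, at_goal, successors, manhattan)
print("blind:   ", blind_cost, blind_expanded)
print("informed:", informed_cost, informed_expanded)
```

Both runs return a plan of the same cost, while the informed run expands fewer states. This is the kind of comparison the abstract reports, although the actual experiments use planning domains and heuristics far richer than this toy grid.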




How to Cite

Luna Gutierrez, R., & Leonetti, M. (2021). Meta Reinforcement Learning for Heuristic Planning. Proceedings of the International Conference on Automated Planning and Scheduling, 31(1), 551-559. Retrieved from