On Picking Good Policies: Leveraging Action-Policy Testing in Policy Training

Jan Eisenhut; Daniel Fišer; Isabel Valera; Jörg Hoffmann

doi:10.1609/icaps.v35i1.36116

Authors

Jan Eisenhut: Saarland University, Saarland Informatics Campus, Saarbrücken, Germany
Daniel Fišer: Aalborg University, Denmark
Isabel Valera: Saarland University, Saarland Informatics Campus, Saarbrücken, Germany; Max Planck Institute for Software Systems, Saarbrücken, Germany
Jörg Hoffmann: Saarland University, Saarland Informatics Campus, Saarbrücken, Germany; German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany

Abstract

Testing is a natural approach to assess the quality of learned action policies π. Prior work introduced policy testing in AI planning as searching for bugs in π, that is, states where π is sub-optimal with respect to a given testing objective. Beyond quality assurance, an obvious application of these methods is policy selection: given several π to choose from, we can use testing to select the "least buggy" one. Here, we integrate testing-based policy selection into the training process. This includes making more informed decisions when selecting the final policy after training, as well as choosing more promising intermediate policies during the training process. Our experiments with ASNets action policies show that integrating testing allows us to more reliably obtain good-quality policies.
