Using Action-Policy Testing in RL to Reduce the Number of Bugs

Authors

  • Hasan Ferit Eniser, Max Planck Institute for Software Systems
  • Songtuan Lin, Saarland University
  • Nicola Müller, German Research Center for Artificial Intelligence (DFKI)
  • Anastasia Isychev, TU Wien
  • Valentin Wüstholz, ConsenSys
  • Isabel Valera, Saarland University
  • Jörg Hoffmann, Saarland University; German Research Center for Artificial Intelligence (DFKI)
  • Maria Christakis, TU Wien

DOI:

https://doi.org/10.1609/socs.v18i1.35990

Abstract

Reinforcement learning (RL) is becoming ever more prominent in solving combinatorial search problems, in particular ones where states are images. Prior work has devised an action-policy testing methodology that identifies so-called bug states, where policy performance is sub-optimal. Here we show how to leverage this methodology during the RL process, using action-policy testing to find bugs and injecting those as alternate start states for the training runs. Running experiments across six 2D games, we find that our testing-guided training often achieves similar expected reward while reducing the number of bugs.
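The idea of injecting bug states as alternate start states can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical 8-state chain world instead of image-based 2D games, tabular Q-learning instead of a deep RL algorithm, and a known optimal return of 1.0 as the test oracle. All names and constants (`find_bug_states`, `train`, `p_bug`, `test_every`, the learning rate, and so on) are invented for illustration.

```python
import random

# Hypothetical toy chain world: states 0..GOAL, actions move left or right.
N_STATES = 8
GOAL = N_STATES - 1
ACTIONS = (-1, +1)
HORIZON = 2 * N_STATES


def step(state, action):
    """Move along the chain; reward 1.0 only when the goal is reached."""
    nxt = min(max(state + action, 0), GOAL)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def greedy(q, state):
    """Greedy action under the Q-table (ties go to the first action)."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def rollout_return(q, start):
    """Undiscounted return of the greedy policy from a given start state."""
    state, total = start, 0.0
    for _ in range(HORIZON):
        state, reward, done = step(state, greedy(q, state))
        total += reward
        if done:
            break
    return total


def find_bug_states(q, candidates):
    """Assumed oracle: the goal is reachable from every state, so the optimal
    return is 1.0 and any state where the policy earns less is a bug."""
    return [s for s in candidates if rollout_return(q, s) < 1.0]


def train(episodes, p_bug, test_every, rng):
    """Q-learning where some episodes start from a previously found bug state."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    bug_pool = []
    for episode in range(episodes):
        if episode % test_every == 0:
            # Periodically test the current policy and refresh the bug pool.
            bug_pool = find_bug_states(q, range(GOAL))
        use_bug = bool(bug_pool) and rng.random() < p_bug
        state = rng.choice(bug_pool) if use_bug else 0
        for _ in range(HORIZON):
            explore = rng.random() < 0.2
            action = rng.choice(ACTIONS) if explore else greedy(q, state)
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += 0.5 * (reward + 0.9 * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q


if __name__ == "__main__":
    q = train(episodes=500, p_bug=0.5, test_every=50, rng=random.Random(0))
    print("remaining bug states:", find_bug_states(q, range(GOAL)))
```

Setting `p_bug=0.0` recovers plain training from the default start state, so the two regimes can be compared by counting the bug states that remain after training.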

Published

2025-07-20

How to Cite

Eniser, H. F., Lin, S., Müller, N., Isychev, A., Wüstholz, V., Valera, I., … Christakis, M. (2025). Using Action-Policy Testing in RL to Reduce the Number of Bugs. Proceedings of the International Symposium on Combinatorial Search, 18(1), 181–185. https://doi.org/10.1609/socs.v18i1.35990