Explaining Deep Reinforcement Learning Agents in the Atari Domain through a Surrogate Model

Alexander Sieusahai; Matthew Guzdial

doi:10.1609/aiide.v17i1.18894

Authors

Alexander Sieusahai, University of Alberta
Matthew Guzdial, University of Alberta

DOI:

https://doi.org/10.1609/aiide.v17i1.18894

Keywords:

Explainable AI, Reinforcement Learning, Atari

Abstract

One major barrier to applications of deep Reinforcement Learning (RL), both inside and outside of games, is the lack of explainability. In this paper, we describe a lightweight and effective method to derive explanations for deep RL agents, which we evaluate in the Atari domain. Our method relies on a transformation of the pixel-based input of the RL agent to a symbolic, interpretable input representation. We then train a surrogate model, which is itself interpretable, to replicate the behavior of the target deep RL agent. Our experiments demonstrate that we can learn an effective surrogate that accurately approximates the underlying decision making of a target agent on a suite of Atari games.
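
The surrogate idea in the abstract can be illustrated with a small sketch. The abstract does not say which symbolic features or which interpretable model the authors use, so everything concrete here is an assumption made for illustration: the three features (`paddle_x`, `ball_x`, `ball_minus_paddle`), the hand-written `target_policy` standing in for the opaque deep RL agent, and the choice of a shallow decision tree as the surrogate. The sketch fits the tree to the target's actions on sampled symbolic states, prints the learned rules, and reports fidelity (the fraction of held-out states on which surrogate and target agree).

```python
import random
from collections import Counter

# Hypothetical symbolic features; the paper's actual representation may differ.
FEATURES = ["paddle_x", "ball_x", "ball_minus_paddle"]

def target_policy(state):
    """Stand-in for the opaque deep RL agent (an assumption for this sketch)."""
    _, _, dx = state
    if dx > 4:
        return "RIGHT"
    if dx < -4:
        return "LEFT"
    return "NOOP"

def sample_state(rng):
    """Draw a random symbolic state on a 160-pixel-wide screen."""
    paddle_x, ball_x = rng.randint(0, 159), rng.randint(0, 159)
    return (paddle_x, ball_x, ball_x - paddle_x)

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_tree(rows, labels, depth):
    """Greedy CART-style fit. A node is a label (leaf) or
    a tuple (feature_index, threshold, left_subtree, right_subtree)."""
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    best = None
    for f in range(len(rows[0])):
        for t in sorted(set(r[f] for r in rows)):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return majority(labels)
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            fit_tree([rows[i] for i in li], [labels[i] for i in li], depth - 1),
            fit_tree([rows[i] for i in ri], [labels[i] for i in ri], depth - 1))

def predict(tree, state):
    """Walk from the root to a leaf and return the leaf's action label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if state[f] <= t else right
    return tree

def describe(tree, indent=""):
    """Render the tree as nested if/else rules, one string per line."""
    if not isinstance(tree, tuple):
        return [f"{indent}-> {tree}"]
    f, t, left, right = tree
    return ([f"{indent}if {FEATURES[f]} <= {t}:"] + describe(left, indent + "  ")
            + [f"{indent}else:"] + describe(right, indent + "  "))

rng = random.Random(0)
train = [sample_state(rng) for _ in range(300)]
# The surrogate is trained on the target's actions, not on game reward.
surrogate = fit_tree(train, [target_policy(s) for s in train], depth=3)

test = [sample_state(rng) for _ in range(1000)]
fidelity = sum(predict(surrogate, s) == target_policy(s) for s in test) / len(test)

print("\n".join(describe(surrogate)))
print(f"fidelity on held-out states: {fidelity:.3f}")
```

In this toy setting the printed rules serve as the explanation, and fidelity indicates how far those rules can be trusted as a description of the target. With a real agent, the states would come from recorded play and the labels from the agent's chosen actions rather than from a hand-written function.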

Published

2021-10-04

How to Cite

Sieusahai, A., & Guzdial, M. (2021). Explaining Deep Reinforcement Learning Agents in the Atari Domain through a Surrogate Model. Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 17(1), 82-90. https://doi.org/10.1609/aiide.v17i1.18894

Issue

Vol. 17 No. 1 (2021): Seventeenth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment

Section

Full Oral Papers
