Robust Opponent Modeling via Adversarial Ensemble Reinforcement Learning
Keywords: Multi-agent Planning And Learning, Representations For Learned Models In Planning, Learning To Improve The Effectiveness Of Planning & Scheduling Systems, Learning Domain And Action Models For Planning
Abstract
This paper studies decision-making in two-player scenarios where the type (e.g., adversary, neutral, or teammate) of the other agent (the opponent) is uncertain to the decision-making agent (the protagonist), an abstraction of security-domain applications. In these settings, the protagonist's reward depends on the opponent's type, but this is private information known only to the opponent itself and thus hidden from the protagonist. In contrast, as is often the case, the type of the protagonist is assumed to be known to the opponent, and this information asymmetry significantly complicates the protagonist's decision-making. In particular, to determine the best actions to take, the protagonist must infer the opponent's type from its observations and from agent modeling. To address this problem, this paper presents an opponent-type deduction module based on Bayes' rule. This inference module takes as input the imagined opponent's decision-making rule (the opponent model) as well as the observed history of the opponent's actions and states, and outputs a belief over the opponent's hidden type. A multiagent reinforcement learning approach is used to develop this game-theoretic opponent model through self-play, which avoids the expensive data-collection step that requires interaction with a real opponent. Moreover, this multiagent approach also captures the strategic interaction and reasoning between agents. In addition, we apply ensemble training to avoid over-fitting to a single opponent model during training; as a result, the learned protagonist policy is also effective against unseen opponents. Experimental results show that the proposed game-theoretic modeling, explicit opponent-type inference, and ensemble training significantly improve decision-making performance over baseline approaches and generalize well against adversaries not seen during training.
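The Bayes-rule type-inference step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the type set, the policy representation (a per-type mapping from state to an action distribution), and all names here are assumptions for the example.

```python
def update_belief(belief, opponent_policies, state, action):
    """One Bayes-rule update of the belief over the opponent's hidden type.

    belief: dict mapping type -> prior probability.
    opponent_policies: dict mapping type -> callable(state) -> dict of
        action -> probability (the imagined opponent model for that type).
    Returns the posterior belief after observing (state, action).
    """
    # Posterior is proportional to prior times the likelihood of the
    # observed action under each type's imagined policy.
    posterior = {}
    for opp_type, prior in belief.items():
        likelihood = opponent_policies[opp_type](state).get(action, 0.0)
        posterior[opp_type] = prior * likelihood
    normalizer = sum(posterior.values())
    if normalizer == 0.0:
        # No modeled type explains the action; keep the prior unchanged.
        return dict(belief)
    return {t: p / normalizer for t, p in posterior.items()}


# Illustrative two-type example: an adversary mostly "attack"s, a
# teammate mostly "help"s. Observing "attack" shifts belief toward
# the adversary type.
policies = {
    "adversary": lambda s: {"attack": 0.9, "help": 0.1},
    "teammate": lambda s: {"attack": 0.1, "help": 0.9},
}
belief = {"adversary": 0.5, "teammate": 0.5}
belief = update_belief(belief, policies, state=None, action="attack")
```

In the paper this update is applied repeatedly over the observed history of opponent states and actions, with the per-type policies supplied by opponent models learned via self-play rather than hand-coded as above.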
How to Cite
Shen, M., & How, J. P. (2021). Robust Opponent Modeling via Adversarial Ensemble Reinforcement Learning. Proceedings of the International Conference on Automated Planning and Scheduling, 31(1), 578-587. Retrieved from https://ojs.aaai.org/index.php/ICAPS/article/view/16006
Special Track on Planning and Learning