Adversarial Policy Switching with Application to RTS Games

Authors

  • Brian King, Oregon State University
  • Alan Fern, Oregon State University
  • Jesse Hostetler, Oregon State University

DOI:

https://doi.org/10.1609/aiide.v8i3.12549

Keywords:

Markov game, real-time strategy game, policy switching

Abstract

Complex games such as RTS games are naturally formalized as Markov games. Given a Markov game, it is often possible to hand-code or learn a set of policies that capture the diversity of possible strategies. It is also often possible to hand-code or learn an abstract simulator of the game that can estimate the outcome of playing two strategies against one another from any state. We consider how to use such policy sets and simulators to make decisions in large Markov games. Prior work has considered this problem using an approach we call minimax policy switching. At each decision epoch, all policy pairs are simulated against each other from the current state, and the minimax policy is chosen and used to select actions until the next decision epoch. While intuitively appealing, we show that this switching policy can have arbitrarily poor worst-case performance. In response, we describe a modified algorithm, monotone policy switching, whose worst-case performance, under certain conditions, is provably no worse than that of the minimax fixed policy in the set. We evaluate these switching policies in both a simulated RTS game and the real game Wargus. The results show the effectiveness of policy switching when the simulator is accurate, and also highlight challenges in the face of inaccurate simulations.
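To illustrate the minimax switching step described in the abstract, the sketch below simulates every pair of candidate policies from the current state and commits to the policy with the best worst-case simulated payoff until the next decision epoch. It is a minimal sketch, not the paper's implementation: the names minimax_policy_switch and simulate, and the convention that higher payoffs favor the maximizing player, are illustrative assumptions.

    def minimax_policy_switch(state, my_policies, opp_policies, simulate):
        """Choose the minimax policy at the current decision epoch.

        simulate(state, pi, rho) is assumed to return an estimated payoff
        (higher = better for us) of following policy pi against opponent
        policy rho from `state` onward, e.g. via an abstract game simulator.
        """
        best_policy, best_value = None, float("-inf")
        for pi in my_policies:
            # Worst-case simulated payoff of pi against any opponent policy.
            worst = min(simulate(state, pi, rho) for rho in opp_policies)
            if worst > best_value:
                best_policy, best_value = pi, worst
        return best_policy

    # Hypothetical usage: policies are callables mapping states to actions,
    # and `simulate` rolls the abstract model forward to the end of the game.
    # chosen = minimax_policy_switch(current_state, policies, policies, simulate)

The paper's monotone variant modifies this switching rule so that, under certain conditions, worst-case performance is provably no worse than the minimax fixed policy; the sketch above covers only the basic minimax switching step.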

Published

2021-06-30

How to Cite

King, B., Fern, A., & Hostetler, J. (2021). Adversarial Policy Switching with Application to RTS Games. Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 8(3), 14-18. https://doi.org/10.1609/aiide.v8i3.12549