Using Multi-Armed Bandits to Dynamically Update Player Models in an Experience Managed Environment


  • Anton Vinogradov University of Kentucky
  • Brent Harrison University of Kentucky



Keywords: Player Modeling, Multi-Armed Bandits, Experience Management


Players are often assumed to have static play-style preferences, but this assumption frequently does not hold. While in most games this is not an issue, in games where experience managers (ExpMs) control the experience, a shift in a player's preferences can lead to loss of engagement and churn. When an ExpM changes the game world, the world becomes biased in favor of the current player model, which in turn influences how the ExpM observes the player's actions and can produce a biased, incorrect player model. In these situations, it is beneficial for the ExpM to recalculate the player model efficiently. In this paper, we show that techniques for solving multi-armed bandit problems, combined with our own notion of distractions, can minimize the time needed to identify a player's preferences after they change, compensate for the bias of the game world, and minimize the number of intrusive elements added to the game world. To evaluate these claims, we use a text-only interactive fiction environment specifically created to be experience managed and to exhibit bias. Our experiments show that multi-armed bandit algorithms can recalculate a player model in response to shifts in a player's preferences more quickly than several baseline methods.
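As a rough illustration of the general idea, and not the algorithm evaluated in the paper, the sketch below uses a standard epsilon-greedy bandit with a constant step size, which favors recent observations and so can track a preference that changes over time. The play-style names, the engagement probabilities, and the `simulate` helper are all hypothetical, and the paper's notion of distractions is not modeled here.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit with a constant step size (recency-weighted)."""

    def __init__(self, n_arms, epsilon=0.1, step_size=0.1, seed=0):
        self.epsilon = epsilon
        self.step_size = step_size
        self.values = [0.0] * n_arms  # estimated engagement per arm
        self.rng = random.Random(seed)

    def best_arm(self):
        # Index of the arm with the highest current estimate.
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return self.best_arm()

    def update(self, arm, reward):
        # A constant step size weights recent rewards more heavily,
        # so old evidence fades after the player's preference shifts.
        self.values[arm] += self.step_size * (reward - self.values[arm])


def simulate(steps=1000, shift_at=500, seed=0):
    """Hypothetical player whose preferred style changes at `shift_at`."""
    styles = ["combat", "exploration", "puzzle"]  # assumed play styles
    bandit = EpsilonGreedyBandit(len(styles), seed=seed)
    player_rng = random.Random(seed + 1)
    before = None
    for t in range(steps):
        preferred = 0 if t < shift_at else 2  # combat first, then puzzle
        arm = bandit.select_arm()
        # Assumed engagement model: the player usually engages with
        # content matching the preferred style and rarely otherwise.
        p_engage = 0.9 if arm == preferred else 0.1
        reward = 1.0 if player_rng.random() < p_engage else 0.0
        bandit.update(arm, reward)
        if t == shift_at - 1:
            before = styles[bandit.best_arm()]
    after = styles[bandit.best_arm()]
    return before, after


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions, each arm stands for a play style the experience manager might cater to, and the reward is a binary engagement signal. A sample-average estimator would adapt more slowly after the shift, which is why the constant step size is used in this sketch.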




How to Cite

Vinogradov, A., & Harrison, B. (2022). Using Multi-Armed Bandits to Dynamically Update Player Models in an Experience Managed Environment. Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 18(1), 207-214.