Learning Strategy Representation for Imitation Learning in Multi-Agent Games

Authors

  • Shiqi Lei Institute of Automation, Chinese Academy of Sciences
  • Kanghoon Lee Korea Advanced Institute of Science and Technology
  • Linjing Li Institute of Automation, Chinese Academy of Sciences; Beijing Wenge Technology Co., Ltd.
  • Jinkyoo Park Korea Advanced Institute of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v39i17.33998

Abstract

The offline datasets for imitation learning (IL) in multi-agent games typically contain player trajectories exhibiting diverse strategies, which necessitate measures to prevent learning algorithms from acquiring undesirable behaviors. Learning representations for these trajectories is an effective approach to depicting the strategies employed by each demonstrator. However, existing learning strategies often require player identification or rely on strong assumptions, which are not appropriate for multi-agent games. Therefore, in this paper, we introduce the Strategy Representation for Imitation Learning (STRIL) framework, which (1) effectively learns strategy representations in multi-agent games, (2) estimates proposed indicators based on these representations, and (3) filters out sub-optimal data using the indicators. STRIL is a plug-in method that can be integrated into existing IL algorithms. We demonstrate the effectiveness of STRIL across competitive multi-agent scenarios, including Two-player Pong, Limit Texas Hold'em, and Connect Four. Our approach successfully acquires strategy representations and indicators, thereby identifying dominant trajectories and significantly enhancing existing IL performance across these environments.
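As a rough illustration of step (3), the filtering stage can be pictured as a preprocessing pass that runs before any IL algorithm sees the data. The sketch below is not the authors' implementation: the abstract does not define the indicators, so the per-trajectory scores are taken as given, and the function name `filter_by_indicator` and the `keep_fraction` parameter are hypothetical choices made for this example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def filter_by_indicator(
    trajectories: Sequence[T],
    scores: Sequence[float],
    keep_fraction: float = 0.5,
) -> List[T]:
    """Keep the highest-scoring fraction of trajectories.

    `scores` stands in for an indicator value per trajectory (assumed:
    higher means a more dominant strategy). How such scores are estimated
    is not covered here.
    """
    if len(trajectories) != len(scores):
        raise ValueError("one score per trajectory is required")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not trajectories:
        return []

    # Number of trajectories to retain; always keep at least one.
    n_keep = max(1, int(len(trajectories) * keep_fraction))

    # Rank indices by score, highest first. Python's sort is stable,
    # so ties keep their original dataset order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Restore dataset order among the retained trajectories.
    kept = sorted(ranked[:n_keep])
    return [trajectories[i] for i in kept]


# Toy usage: four placeholder trajectories with made-up indicator scores.
dataset = ["traj_a", "traj_b", "traj_c", "traj_d"]
indicator = [0.2, 0.9, 0.5, 0.7]
filtered = filter_by_indicator(dataset, indicator, keep_fraction=0.5)
```

Because the filter only returns a subset of the original dataset, the retained trajectories can be passed unchanged to an existing IL method, which is the sense in which the abstract describes STRIL as a plug-in.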

Published

2025-04-11

How to Cite

Lei, S., Lee, K., Li, L., & Park, J. (2025). Learning Strategy Representation for Imitation Learning in Multi-Agent Games. Proceedings of the AAAI Conference on Artificial Intelligence, 39(17), 18163–18171. https://doi.org/10.1609/aaai.v39i17.33998

Issue

Vol. 39 No. 17 (2025)

Section

AAAI Technical Track on Machine Learning III