Complex Instruction Following with Diverse Style Policies in Football Games

Authors

  • Chenglu Sun, Sports Products Department, Interactive Entertainment Group, Tencent
  • Shuo Shen, Sports Products Department, Interactive Entertainment Group, Tencent
  • Haonan Hu, Sports Products Department, Interactive Entertainment Group, Tencent
  • Wei Zhou, School of Future Technology, Nanjing University of Information Science and Technology
  • Chen Chen, Human Phenome Institute, Fudan University

DOI:

https://doi.org/10.1609/aaai.v40i30.39762

Abstract

Despite advances in language-controlled reinforcement learning (LC-RL) for basic domains and straightforward commands (e.g., object manipulation and navigation), extending LC-RL to comprehend and execute high-level or abstract instructions in complex multi-agent environments, such as football games, remains a significant challenge. To address this gap, we introduce Language-Controlled Diverse Style Policies (LCDSP), a novel LC-RL paradigm designed for complex scenarios. LCDSP comprises two key components: a Diverse Style Training (DST) method and a Style Interpreter (SI). The DST method efficiently trains a single policy that can exhibit a wide range of behaviors, with the agent's actions modulated through style parameters (SP). The SI translates high-level language instructions into the corresponding SP accurately and rapidly. Through extensive experiments in a complex 5v5 football environment, we demonstrate that LCDSP comprehends abstract tactical instructions and accurately executes the desired behavioral styles, showcasing its potential for complex, real-world applications.
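The two-stage flow described in the abstract (instruction to SP, then SP to behavior) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the style dimensions, the interpreter, or the policy, so every name below (STYLE_DIMS, ACTION_AFFINITY, KEYWORD_STYLES, interpret, select_action) and every numeric value is a hypothetical stand-in. A keyword lookup plays the role of the SI, and a fixed scoring rule plays the role of the trained style-conditioned policy.

```python
from typing import Dict

# Hypothetical style dimensions; the abstract does not list the real SP.
STYLE_DIMS = ("attack", "pass", "defend")

# Toy affinity of each action to each style dimension (illustrative values).
ACTION_AFFINITY: Dict[str, Dict[str, float]] = {
    "shoot": {"attack": 1.0, "pass": 0.0, "defend": 0.0},
    "pass": {"attack": 0.2, "pass": 1.0, "defend": 0.1},
    "tackle": {"attack": 0.0, "pass": 0.0, "defend": 1.0},
}

# Toy stand-in for the Style Interpreter: keyword -> style parameters.
# A real interpreter would presumably be a learned or language-model component.
KEYWORD_STYLES: Dict[str, Dict[str, float]] = {
    "press": {"attack": 0.9, "pass": 0.3, "defend": 0.2},
    "possession": {"attack": 0.3, "pass": 0.9, "defend": 0.3},
    "park the bus": {"attack": 0.1, "pass": 0.3, "defend": 0.9},
}

# Fallback style used when no keyword matches.
NEUTRAL: Dict[str, float] = {dim: 0.5 for dim in STYLE_DIMS}


def interpret(instruction: str) -> Dict[str, float]:
    """Map a free-text instruction to style parameters via keyword lookup."""
    text = instruction.lower()
    for keyword, style in KEYWORD_STYLES.items():
        if keyword in text:
            return dict(style)
    return dict(NEUTRAL)


def select_action(style: Dict[str, float]) -> str:
    """Pick the action whose affinity best matches the style parameters.

    Stands in for a single policy conditioned on SP: the same function
    yields different behavior as the style vector changes.
    """
    scores = {
        action: sum(style[dim] * affinity[dim] for dim in STYLE_DIMS)
        for action, affinity in ACTION_AFFINITY.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    for instruction in ("Press high up the pitch", "Keep possession", "Park the bus"):
        style = interpret(instruction)
        print(instruction, "->", style, "->", select_action(style))
```

The point of the sketch is the separation of concerns: the interpreter only has to produce a small numeric vector, and a single policy covers all styles by taking that vector as input, so no per-instruction policy is needed.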

Published

2026-03-14

How to Cite

Sun, C., Shen, S., Hu, H., Zhou, W., & Chen, C. (2026). Complex Instruction Following with Diverse Style Policies in Football Games. Proceedings of the AAAI Conference on Artificial Intelligence, 40(30), 25654–25662. https://doi.org/10.1609/aaai.v40i30.39762

Section

AAAI Technical Track on Machine Learning VII