PADiff: Predictive and Adaptive Diffusion Policies for Ad Hoc Teamwork
DOI:
https://doi.org/10.1609/aaai.v40i24.39078
Abstract
Ad hoc teamwork (AHT) requires agents to collaborate with previously unseen teammates, which is crucial for many real-world applications. The core challenge of AHT is to develop an ego agent that can predict and adapt to unknown teammates on the fly. Conventional RL-based approaches optimize a single expected return, which often causes policies to collapse into a single dominant behavior and thus fail to capture the multimodal cooperation patterns inherent in AHT. In this work, we introduce PADiff, a diffusion-based approach that captures an agent's multimodal behaviors, unlocking its diverse cooperation modes with teammates. However, standard diffusion models lack the ability to predict and adapt in non-stationary AHT scenarios. To address this limitation, we propose a novel diffusion-based policy that integrates critical predictive information about teammates into the denoising process. Extensive experiments across three environments demonstrate that PADiff significantly outperforms existing AHT methods.
Published
2026-03-14
How to Cite
Chan, H., Zhang, X., Xiang, A., Zhang, W., & Zhao, M. (2026). PADiff: Predictive and Adaptive Diffusion Policies for Ad Hoc Teamwork. Proceedings of the AAAI Conference on Artificial Intelligence, 40(24), 19943-19951. https://doi.org/10.1609/aaai.v40i24.39078
Issue
Vol. 40 No. 24 (2026)
Section
AAAI Technical Track on Machine Learning I