TMAE: Learning Targeted Multi-Agent Exploration via Causal Inference

Authors

  • Chuxiong Sun, National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences
  • Dunqi Yao, National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Rui Wang, National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences; National Key Laboratory of Complex System Modeling and Simulation Technology
  • Wenwen Qiang, National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences
  • Changwen Zheng, National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences
  • Jiangmeng Li, National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v40i30.39765

Abstract

Exploration in sparse-reward tasks remains a fundamental challenge in multi-agent reinforcement learning (MARL) due to complex inter-agent interactions and the expansive exploration space. To address this issue, we propose Targeted Multi-Agent Exploration (TMAE), a novel framework that uncovers the causal relationships between the state space and the reward function, thereby reducing the exploration space and enabling more targeted exploration. Specifically, we construct a structural causal model (SCM) to model the causality between sub-state variables and sparse rewards, providing a robust analytical foundation for subsequent causal inference. Through counterfactual causal intervention, TMAE identifies the most critical subspaces for discovering rare but pivotal events while filtering out confounders. By incorporating these causal insights into the exploration process, TMAE prioritizes subspaces with stronger causal effects on sparse rewards, significantly enhancing exploration efficiency. We evaluate TMAE on a range of MARL benchmarks featuring sparse rewards, consistently demonstrating superior exploration efficiency compared to state-of-the-art methods. Furthermore, visualized causal insights derived from TMAE reveal its ability to effectively capture intricate dependencies and priorities in targeted exploration, showcasing strong alignment with prior domain knowledge.
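
As a rough illustration of the idea described in the abstract, the following minimal sketch estimates how strongly each sub-state variable affects a sparse reward by intervening on it, then turns those effects into exploration weights. It is not the authors' implementation: the toy reward, the variable names (key, door, noise), and the functions interventional_effect and exploration_weights are all hypothetical, and the exhaustive enumeration of binary contexts stands in for whatever estimator the paper actually uses.

```python
import itertools


def sparse_reward(state):
    """Toy sparse reward (assumption): fires only when 'key' and 'door' are both set."""
    return 1.0 if state["key"] == 1 and state["door"] == 1 else 0.0


def interventional_effect(reward_fn, variables, target):
    """Average |R(do(target=1)) - R(do(target=0))| over all settings of the other variables."""
    others = [v for v in variables if v != target]
    total = 0.0
    count = 0
    for values in itertools.product([0, 1], repeat=len(others)):
        context = dict(zip(others, values))
        r_on = reward_fn({**context, target: 1})   # intervene: force target to 1
        r_off = reward_fn({**context, target: 0})  # intervene: force target to 0
        total += abs(r_on - r_off)
        count += 1
    return total / count


def exploration_weights(effects):
    """Normalize effects so that sub-states with stronger effects get more exploration weight."""
    norm = sum(effects.values())
    if norm == 0:
        # No variable influences the reward: fall back to uniform weights.
        return {name: 1.0 / len(effects) for name in effects}
    return {name: value / norm for name, value in effects.items()}


# Hypothetical binary sub-state variables; 'noise' never influences the reward.
variables = ["key", "door", "noise"]
effects = {v: interventional_effect(sparse_reward, variables, v) for v in variables}
weights = exploration_weights(effects)
print(effects)
print(weights)
```

In this toy setting the irrelevant variable receives zero weight, so an exploration bonus scaled by these weights would concentrate on the key and door subspaces, which mirrors the targeted-exploration intuition in the abstract.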

Published

2026-03-14

How to Cite

Sun, C., Yao, D., Wang, R., Qiang, W., Zheng, C., & Li, J. (2026). TMAE: Learning Targeted Multi-Agent Exploration via Causal Inference. Proceedings of the AAAI Conference on Artificial Intelligence, 40(30), 25682–25690. https://doi.org/10.1609/aaai.v40i30.39765

Section

AAAI Technical Track on Machine Learning VII