TMAE: Learning Targeted Multi-Agent Exploration via Causal Inference

Chuxiong Sun; Dunqi Yao; Rui Wang; Wenwen Qiang; Changwen Zheng; Jiangmeng Li

doi:10.1609/aaai.v40i30.39765

Authors

Chuxiong Sun (National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences)
Dunqi Yao (National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences)
Rui Wang (National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences; National Key Laboratory of Complex System Modeling and Simulation Technology)
Wenwen Qiang (National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences)
Changwen Zheng (National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences)
Jiangmeng Li (National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences)

Abstract

Exploration in sparse-reward tasks remains a fundamental challenge in multi-agent reinforcement learning (MARL) because of complex inter-agent interactions and the expansive exploration space. To address this issue, we propose Targeted Multi-Agent Exploration (TMAE), a novel framework that uncovers the causal relationships between the state space and the reward function, thereby reducing the exploration space and enabling more targeted exploration. Specifically, we construct a structural causal model (SCM) of the causal relationships between sub-state variables and sparse rewards, which provides the analytical foundation for subsequent causal inference. Through counterfactual causal intervention, TMAE identifies the subspaces most critical for discovering rare but pivotal events while filtering out confounders. By incorporating these causal insights into the exploration process, TMAE prioritizes subspaces with stronger causal effects on sparse rewards, significantly enhancing exploration efficiency. We evaluate TMAE on a range of sparse-reward MARL benchmarks, where it consistently demonstrates superior exploration efficiency compared to state-of-the-art methods. Furthermore, visualizations of the causal insights derived from TMAE show that it captures intricate dependencies and priorities in targeted exploration, in strong alignment with prior domain knowledge.
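
The idea of ranking sub-state variables by their causal effect on a sparse reward can be illustrated with a toy sketch. This is not the TMAE algorithm, whose details the abstract does not give. It assumes a hypothetical three-variable binary state and a hypothetical reward that fires only when the first two variables are active, and it swaps in a simple average interventional effect, E[r | do(x_i = 1)] - E[r | do(x_i = 0)], for the paper's counterfactual procedure. All function and variable names are illustrative.

```python
from itertools import product


def reward(state):
    # Hypothetical sparse reward: fires only when sub-states 0 and 1 are both 1.
    # Sub-state 2 has no influence on the reward.
    return 1.0 if state[0] == 1 and state[1] == 1 else 0.0


def interventional_effect(reward_fn, n_vars, index):
    """Mean reward under do(x_index=1) minus do(x_index=0).

    The remaining binary variables are enumerated uniformly, so the
    result is exact for this toy model rather than a sampled estimate.
    """
    others = list(product([0, 1], repeat=n_vars - 1))
    totals = {0: 0.0, 1: 0.0}
    for value in (0, 1):
        for rest in others:
            state = list(rest)
            state.insert(index, value)  # force variable `index` to `value`
            totals[value] += reward_fn(tuple(state))
    return (totals[1] - totals[0]) / len(others)


def exploration_weights(reward_fn, n_vars):
    """Normalize absolute effects into per-variable exploration priorities."""
    effects = [abs(interventional_effect(reward_fn, n_vars, i)) for i in range(n_vars)]
    total = sum(effects)
    if total == 0:
        # No variable affects the reward: fall back to uniform exploration.
        return [1.0 / n_vars] * n_vars
    return [e / total for e in effects]


weights = exploration_weights(reward, 3)
print(weights)
```

In this toy setting the irrelevant third variable receives zero weight, so an exploration bonus scaled by these weights would concentrate on the two variables that actually determine the reward.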
