DSAP: Enhancing Generalization in Goal-Conditioned Reinforcement Learning

Yiming Wang; Kaiyan Zhao; Ming Yang; Yan Li; Furui Liu; Jiayu Chen; Leong Hou U

doi:10.1609/aaai.v40i31.39877

Authors

Yiming Wang, University of Macau
Kaiyan Zhao, Wuhan University
Ming Yang, Hong Kong Polytechnic University
Yan Li, Shenzhen Polytechnic University
Furui Liu, Zhejiang Lab
Jiayu Chen, University of Hong Kong
Leong Hou U, University of Macau

Abstract

Goal-conditioned Reinforcement Learning (RL) is a promising direction for training agents capable of tackling a variety of tasks. However, generalizing to new goals in different environments remains a central challenge for goal-conditioned RL agents. Existing methods often rely on state abstraction, which involves learning abstracted state representations by excluding irrelevant features, to improve generalization. Despite their success in simplified settings, these methods often fail to generalize effectively to realistic environments with varied goals. In this work, we propose to enhance generalization through state abstraction from the perspective of causal inference. We hypothesize that the generalization gap arises in part due to unobserved confounders: latent variables that simultaneously influence both the global and goal states. To address this, we introduce Deconfounded State Abstraction for Policy learning (DSAP), a novel framework that mitigates backdoor confounding by employing a learned causal graph as a *proxy* for the hidden confounders. We provide theoretical analysis demonstrating that DSAP improves both the learning process and the generalization capability of goal-conditioned policies. Extensive experiments across different settings of multiple benchmarks show that our method significantly outperforms existing methods.
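
To make the notion of backdoor confounding concrete, the following is a minimal sketch of the standard backdoor adjustment on a toy discrete model. It is not the DSAP algorithm: the variables U, X, and Y, the probability tables, and all function names are hypothetical and chosen only for illustration. The sketch also assumes the confounder U is observed, whereas the abstract describes the confounders as hidden and approximated through a learned causal graph.

```python
# Toy backdoor adjustment. All numbers and names are illustrative assumptions,
# not taken from the paper.
# U: binary confounder, X: binary cause, Y: binary outcome.
# U influences both X and Y, which opens a backdoor path X <- U -> Y.

P_U = {0: 0.5, 1: 0.5}             # P(U = u)
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}    # P(X = 1 | U = u)


def p_y1_given_xu(x, u):
    """P(Y = 1 | X = x, U = u); the direct effect of X on Y is 0.3."""
    return 0.1 + 0.3 * x + 0.5 * u


def p_x_given_u(x, u):
    """P(X = x | U = u) for binary x."""
    p1 = P_X1_GIVEN_U[u]
    return p1 if x == 1 else 1.0 - p1


def p_y1_given_x(x):
    """Observational P(Y = 1 | X = x): weights each u by P(u | x)."""
    p_x = sum(p_x_given_u(x, u) * P_U[u] for u in P_U)
    joint = sum(p_y1_given_xu(x, u) * p_x_given_u(x, u) * P_U[u] for u in P_U)
    return joint / p_x


def p_y1_do_x(x):
    """Interventional P(Y = 1 | do(X = x)): weights each u by the marginal P(u)."""
    return sum(p_y1_given_xu(x, u) * P_U[u] for u in P_U)


naive_effect = p_y1_given_x(1) - p_y1_given_x(0)
adjusted_effect = p_y1_do_x(1) - p_y1_do_x(0)

print(f"naive effect:    {naive_effect:.2f}")
print(f"adjusted effect: {adjusted_effect:.2f}")
```

In this toy model the naive conditional contrast comes out at 0.60, while the adjusted contrast recovers the direct effect of 0.30 that was built into `p_y1_given_xu`. The difference is the bias introduced by the confounder, which is the kind of gap the abstract attributes to unobserved confounding.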
