DSAP: Enhancing Generalization in Goal-Conditioned Reinforcement Learning

Authors

  • Yiming Wang, University of Macau
  • Kaiyan Zhao, Wuhan University
  • Ming Yang, Hong Kong Polytechnic University
  • Yan Li, Shenzhen Polytechnic University
  • Furui Liu, Zhejiang Lab
  • Jiayu Chen, University of Hong Kong
  • Leong Hou U, University of Macau

DOI:

https://doi.org/10.1609/aaai.v40i31.39877

Abstract

Goal-conditioned Reinforcement Learning (RL) is a promising direction for training agents capable of tackling a variety of tasks. However, generalizing to new goals in different environments remains a central challenge for goal-conditioned RL agents. Existing methods often rely on state abstraction, which involves learning abstracted state representations by excluding irrelevant features, to improve generalization. Despite their success in simplified settings, these methods often fail to generalize effectively to realistic environments with varied goals. In this work, we propose to enhance generalization through state abstraction from the perspective of causal inference. We hypothesize that the generalization gap arises in part due to unobserved confounders: latent variables that simultaneously influence both the global and goal states. To address this, we introduce Deconfounded State Abstraction for Policy learning (DSAP), a novel framework that mitigates backdoor confounding by employing a learned causal graph as a *proxy* for the hidden confounders. We provide theoretical analysis demonstrating that DSAP improves both the learning process and the generalization capability of goal-conditioned policies. Extensive experiments across different settings of multiple benchmarks show that our method significantly outperforms existing methods.

Published

2026-03-14

How to Cite

Wang, Y., Zhao, K., Yang, M., Li, Y., Liu, F., Chen, J., & U, L. H. (2026). DSAP: Enhancing Generalization in Goal-Conditioned Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(31), 26679–26687. https://doi.org/10.1609/aaai.v40i31.39877

Section

AAAI Technical Track on Machine Learning VIII