Dynamics-Aware Planning Representation for Zero-Shot Reinforcement Learning (Student Abstract)

Authors

  • Jungho An, Korea Advanced Institute of Science and Technology
  • Taeyoung Kim, Korea Advanced Institute of Science and Technology
  • Haeun Kim, Korea Advanced Institute of Science and Technology
  • Dongsoo Har, Korea Advanced Institute of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v40i48.42185

Abstract

Offline Zero-Shot Reinforcement Learning requires an agent to solve unseen tasks using only a fixed offline dataset without explicit rewards. A central challenge is learning representations that capture both high-level long-term planning and low-level physical dynamics. We propose a novel framework, Dynamics-Aware Planning Representation (DAPR), which disentangles these two aspects via complementary contrastive objectives. Specifically, DAPR learns goal-oriented planning directions and local dynamics-consistent directions in the latent space. By jointly enforcing these constraints, DAPR yields representations that balance “where to go” with “how to move.” Experiments on standard locomotion benchmarks (Walker, Cheetah, Quadruped) demonstrate that DAPR consistently improves performance and generalization over strong baselines, achieving substantial gains on precision-demanding tasks.
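
The abstract does not give the form of the two contrastive objectives, so the following is only an illustrative sketch of how a pair of contrastive terms over latent directions could be combined. It assumes a generic InfoNCE-style loss with in-batch negatives, and every name in it (`info_nce`, `joint_direction_loss`, `plan_pred`, `dyn_pred`, `weight`, `temperature`) is hypothetical rather than taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss with in-batch negatives.

    Row i of `positives` is the positive for row i of `anchors`;
    all other rows in the batch act as negatives.
    """
    a = l2_normalize(anchors)
    p = l2_normalize(positives)
    logits = a @ p.T / temperature
    # Subtract the row-wise maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def joint_direction_loss(z_s, z_next, z_goal, plan_pred, dyn_pred,
                         weight=1.0, temperature=0.1):
    """Hypothetical joint objective over two kinds of latent directions.

    Assumption (not from the paper): the planning direction is the latent
    displacement toward a goal state, and the dynamics direction is the
    latent displacement to the next state.
    """
    plan_dir = z_goal - z_s   # "where to go"
    dyn_dir = z_next - z_s    # "how to move"
    plan_term = info_nce(plan_pred, plan_dir, temperature)
    dyn_term = info_nce(dyn_pred, dyn_dir, temperature)
    return plan_term + weight * dyn_term
```

In this sketch the `weight` parameter trades off the planning term against the dynamics term; how DAPR actually balances or parameterizes the two constraints is not stated in the abstract.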

Published

2026-03-14

How to Cite

An, J., Kim, T., Kim, H., & Har, D. (2026). Dynamics-Aware Planning Representation for Zero-Shot Reinforcement Learning (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41124–41125. https://doi.org/10.1609/aaai.v40i48.42185