Latent State-Predictive Exploration for Deep Reinforcement Learning

Authors

  • Yiming Wang, University of Macau
  • Kaiyan Zhao, Wuhan University
  • Borong Zhang, University of Macau
  • Yan Li, Shenzhen Polytechnic University
  • Leong Hou U, University of Macau

DOI:

https://doi.org/10.1609/aaai.v40i31.39875

Abstract

Reinforcement learning (RL) has achieved promising results in continuous control tasks, where efficient exploration of the state space is crucial for success. However, many recent RL approaches still struggle with sample inefficiency and insufficient exploration in long-horizon tasks, particularly in environments characterized by high-dimensional and complex state spaces. To address these challenges, we propose a novel exploration framework, Latent State-Predictive Exploration (LSPE). The core idea behind LSPE is to endow the agent with a form of "foresight" to enhance exploration in long-horizon settings. Specifically, LSPE employs a state encoder to learn compact latent representations from high-dimensional visual observations, effectively filtering out irrelevant or noisy information. To further enrich and stabilize these representations, we incorporate a diffusion-based self-predictive module that enforces temporal consistency by predicting future states, thereby improving both exploration and downstream predictive control. Additionally, we introduce an Exploration Reward Function (ERF) that explicitly encourages the agent to visit novel latent states. This reward signal promotes more efficient and scalable exploration in complex environments. We evaluate LSPE across a diverse set of challenging long-horizon navigation and manipulation tasks, spanning simulation environments such as Habitat and Robosuite, as well as deployment on a real robot in a physical indoor environment. Experimental results show that LSPE substantially enhances exploration efficiency and scales effectively to complex, high-dimensional tasks.
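The abstract does not specify how the Exploration Reward Function scores latent-state novelty. A common way to realize such a signal, shown here purely as an illustrative sketch and not as the paper's actual ERF, is to reward latent states by their distance to the k nearest latents the agent has already visited (all function and variable names below are hypothetical):

```python
import numpy as np

def novelty_reward(z, memory, k=3):
    """Illustrative intrinsic reward: mean Euclidean distance from
    latent state z to its k nearest neighbors among previously
    visited latents stored in `memory` (a list of 1-D arrays)."""
    if len(memory) == 0:
        return 1.0  # nothing visited yet: treat the state as maximally novel
    dists = np.linalg.norm(np.asarray(memory) - z, axis=1)
    k = min(k, len(dists))
    return float(np.mean(np.sort(dists)[:k]))

# Usage: a revisited latent earns no reward; a distant one earns more.
memory = []
z0 = np.array([0.0, 0.0])
r_first = novelty_reward(z0, memory)      # empty memory -> 1.0
memory.append(z0)
r_repeat = novelty_reward(z0, memory)     # exact revisit -> 0.0
r_far = novelty_reward(np.array([3.0, 4.0]), memory)  # distance 5 -> 5.0
```

In practice such a reward is added to the task reward with a decaying weight, and the memory is periodically subsampled to keep the nearest-neighbor search cheap; the paper's ERF may differ in both respects.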

Published

2026-03-14

How to Cite

Wang, Y., Zhao, K., Zhang, B., Li, Y., & U, L. H. (2026). Latent State-Predictive Exploration for Deep Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(31), 26661–26669. https://doi.org/10.1609/aaai.v40i31.39875

Section

AAAI Technical Track on Machine Learning VIII