Visual Reinforcement Learning with Residual Action
DOI: https://doi.org/10.1609/aaai.v39i18.34097

Abstract
Learning control policies in continuous action spaces from visual observations is a fundamental and challenging task in reinforcement learning (RL). An essential problem is how to accurately map high-dimensional images to optimal actions through the policy network. Traditional decision-making modules output actions based solely on the current observation, while the distributions of optimal actions depend on the specific task and cannot be known in advance, which increases the learning difficulty. To ease learning, we analyze the action characteristics of several control tasks and propose Reinforcement Learning with Residual Action (ResAct), which explicitly models action adjustments based on the differences between adjacent observations, rather than learning actions directly from observations. The method merely redefines the output of the policy network and introduces no prior assumptions that constrain or simplify the vanilla control problem. Extensive experiments on the DeepMind Control Suite and CARLA demonstrate that the method significantly improves different RL baselines and achieves state-of-the-art performance.

Published
2025-04-11
How to Cite
Liu, Z., Peng, P., & Tian, Y. (2025). Visual Reinforcement Learning with Residual Action. Proceedings of the AAAI Conference on Artificial Intelligence, 39(18), 19050–19058. https://doi.org/10.1609/aaai.v39i18.34097
Section: AAAI Technical Track on Machine Learning IV
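The core idea described in the abstract — predicting an action adjustment from the difference between adjacent observations, and adding it to the previous action, rather than regressing the absolute action from the current observation — could be sketched as follows. This is a minimal illustration under assumptions, not the paper's architecture: the class name, the random linear map standing in for a learned network, and the tanh/clip bounding are all hypothetical choices for the sketch.

```python
import numpy as np

class ResidualActionPolicy:
    """Hypothetical sketch of a residual-action policy head.

    Instead of mapping the observation directly to an absolute action,
    the network predicts an adjustment (delta) that is added to the
    previous action. This mirrors the idea stated in the abstract; it
    is not the paper's actual implementation.
    """

    def __init__(self, obs_dim, act_dim, act_low=-1.0, act_high=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Stand-in for a learned network: a small random linear map over
        # the concatenated (observation, observation-difference) features.
        self.W = rng.normal(scale=0.1, size=(act_dim, 2 * obs_dim))
        self.act_low, self.act_high = act_low, act_high
        self.prev_obs = None
        self.prev_act = np.zeros(act_dim)

    def act(self, obs):
        obs = np.asarray(obs, dtype=float)
        if self.prev_obs is None:
            obs_diff = np.zeros_like(obs)   # first step: no history yet
        else:
            obs_diff = obs - self.prev_obs  # change between adjacent observations
        features = np.concatenate([obs, obs_diff])
        delta = np.tanh(self.W @ features)  # bounded action adjustment
        # Residual update: new action = previous action + adjustment,
        # clipped to the valid action range.
        action = np.clip(self.prev_act + delta, self.act_low, self.act_high)
        self.prev_obs, self.prev_act = obs, action
        return action
```

A policy head like this can wrap any actor in an off-the-shelf RL algorithm: only the output parameterization changes, which matches the abstract's claim that the method redefines the policy output without otherwise constraining the control problem.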