Visual Reinforcement Learning with Residual Action
DOI: https://doi.org/10.1609/aaai.v39i18.34097

Abstract
Learning control policies in continuous action spaces from visual observations is a fundamental and challenging task in reinforcement learning (RL). An essential problem is how to accurately map high-dimensional images to optimal actions through the policy network. Traditional decision-making modules output actions based solely on the current observation, while the distributions of optimal actions depend on the specific task and cannot be known in advance, which increases the learning difficulty. To ease learning, we analyze the action characteristics of several control tasks and propose Reinforcement Learning with Residual Action (ResAct), which explicitly models action adjustments based on the differences between adjacent observations, rather than learning actions directly from observations. The method merely redefines the output of the policy network and introduces no prior assumptions that constrain or simplify the vanilla control problem. Extensive experiments on the DeepMind Control Suite and CARLA demonstrate that the method significantly improves different RL baselines and achieves state-of-the-art performance.

Published
2025-04-11
How to Cite
Liu, Z., Peng, P., & Tian, Y. (2025). Visual Reinforcement Learning with Residual Action. Proceedings of the AAAI Conference on Artificial Intelligence, 39(18), 19050–19058. https://doi.org/10.1609/aaai.v39i18.34097
Section: AAAI Technical Track on Machine Learning IV
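The core idea described in the abstract — predicting an action adjustment from the difference between adjacent observations, and adding it to the previous action, rather than regressing the absolute action from the current observation — could be sketched as follows. This is a minimal illustration under assumptions, not the paper's architecture: the class name, the random linear map standing in for a learned network, and the tanh/clip bounding are all hypothetical choices for the sketch.

```python
import numpy as np

class ResidualActionPolicy:
    """Hypothetical sketch of a residual-action policy head.

    Instead of mapping the observation directly to an absolute action,
    the network predicts an adjustment (delta) that is added to the
    previous action. This mirrors the idea stated in the abstract; it
    is not the paper's actual implementation.
    """

    def __init__(self, obs_dim, act_dim, act_low=-1.0, act_high=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Stand-in for a learned network: a small random linear map over
        # the concatenated (observation, observation-difference) features.
        self.W = rng.normal(scale=0.1, size=(act_dim, 2 * obs_dim))
        self.act_low, self.act_high = act_low, act_high
        self.prev_obs = None
        self.prev_act = np.zeros(act_dim)

    def act(self, obs):
        obs = np.asarray(obs, dtype=float)
        if self.prev_obs is None:
            obs_diff = np.zeros_like(obs)   # first step: no history yet
        else:
            obs_diff = obs - self.prev_obs  # change between adjacent observations
        features = np.concatenate([obs, obs_diff])
        delta = np.tanh(self.W @ features)  # bounded action adjustment
        # Residual update: new action = previous action + adjustment,
        # clipped to the valid action range.
        action = np.clip(self.prev_act + delta, self.act_low, self.act_high)
        self.prev_obs, self.prev_act = obs, action
        return action
```

A policy head like this can wrap any actor in an off-the-shelf RL algorithm: only the output parameterization changes, which matches the abstract's claim that the method redefines the policy output without otherwise constraining the control problem.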