Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions
Keywords: ML: Reinforcement Learning Algorithms, ML: Representation Learning
Abstract
Recent work has shown that representation learning plays a critical role in sample-efficient reinforcement learning (RL) from pixels. Unfortunately, in real-world scenarios, representation learning is usually fragile to task-irrelevant distractions such as variations in background or viewpoint. To tackle this problem, we propose a novel clustering-based approach, Clustering with Bisimulation Metrics (CBM), which learns robust representations by grouping visual observations in the latent space. Specifically, CBM alternates between two steps: (1) grouping observations by measuring their bisimulation distances to the learned prototypes; (2) learning a set of prototypes according to the current cluster assignments. Computing cluster assignments with bisimulation metrics enables CBM to capture task-relevant information, as bisimulation metrics quantify the behavioral similarity between observations. Moreover, CBM encourages the consistency of representations within each group, which helps filter out task-irrelevant information and thus induces representations that are robust to distractions. An appealing feature is that CBM can achieve sample-efficient representation learning even when multiple distractions are present simultaneously. Experiments demonstrate that CBM significantly improves the sample efficiency of popular visual RL algorithms and achieves state-of-the-art performance in both multiple-distraction and single-distraction settings. The code is available at https://github.com/MIRALab-USTC/RL-CBM.
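The two alternating steps described in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation. The abstract does not define the distance or the prototype update, so everything below is an assumption made for illustration: `stand_in_distance` is a hypothetical, simplified bisimulation-style distance (reward gap plus a discounted gap between next-step latents), prototypes are updated by a k-means-style mean rather than learned by gradient descent, and all names and the `gamma` parameter are invented for this example.

```python
import math


def stand_in_distance(obs, proto, gamma=0.9):
    """Hypothetical stand-in for a bisimulation-style distance (an assumption,
    not the paper's metric): reward gap plus discounted next-latent gap."""
    reward_gap = abs(obs["reward"] - proto["reward"])
    latent_gap = math.dist(obs["next_latent"], proto["next_latent"])
    return reward_gap + gamma * latent_gap


def assign_clusters(observations, prototypes, gamma=0.9):
    """Step 1: assign each observation to its nearest prototype."""
    return [
        min(range(len(prototypes)),
            key=lambda k: stand_in_distance(obs, prototypes[k], gamma))
        for obs in observations
    ]


def update_prototypes(observations, assignments, prototypes):
    """Step 2 (simplified): set each prototype to the mean of its members.
    A prototype with no members is left unchanged."""
    updated = []
    for k, proto in enumerate(prototypes):
        members = [o for o, a in zip(observations, assignments) if a == k]
        if not members:
            updated.append(proto)
            continue
        n = len(members)
        dim = len(proto["next_latent"])
        updated.append({
            "reward": sum(m["reward"] for m in members) / n,
            "next_latent": [
                sum(m["next_latent"][d] for m in members) / n
                for d in range(dim)
            ],
        })
    return updated


def alternate(observations, prototypes, rounds=5, gamma=0.9):
    """Alternate between assignment and prototype update for a fixed number of rounds."""
    assignments = []
    for _ in range(rounds):
        assignments = assign_clusters(observations, prototypes, gamma)
        prototypes = update_prototypes(observations, assignments, prototypes)
    return assignments, prototypes


# Toy data: two behaviorally distinct groups of observations.
observations = [
    {"reward": 0.0, "next_latent": [0.0, 0.0]},
    {"reward": 0.1, "next_latent": [0.1, 0.0]},
    {"reward": 1.0, "next_latent": [1.0, 1.0]},
    {"reward": 0.9, "next_latent": [0.9, 1.0]},
]
initial_prototypes = [
    {"reward": 0.2, "next_latent": [0.2, 0.2]},
    {"reward": 0.8, "next_latent": [0.8, 0.8]},
]
assignments, prototypes = alternate(observations, initial_prototypes)
```

In this toy setting, observations with similar rewards and similar next-step latents end up in the same cluster regardless of any other features they might carry, which is the intuition behind grouping by behavioral similarity. The released code at the URL above is the reference for the actual method.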
How to Cite
Liu, Q., Zhou, Q., Yang, R., & Wang, J. (2023). Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions. Proceedings of the AAAI Conference on Artificial Intelligence, 37(7), 8843-8851. https://doi.org/10.1609/aaai.v37i7.26063
AAAI Technical Track on Machine Learning II