REMOTE: Reinforced Motion Transformation Network for Semi-supervised 2D Pose Estimation in Videos

Authors

  • Xianzheng Ma, Wuhan University; Singapore University of Technology and Design
  • Hossein Rahmani, Lancaster University
  • Zhipeng Fan, New York University
  • Bin Yang, Wuhan University
  • Jun Chen, Wuhan University
  • Jun Liu, Singapore University of Technology and Design

DOI:

https://doi.org/10.1609/aaai.v36i2.20089

Keywords:

Computer Vision (CV)

Abstract

Existing approaches for 2D pose estimation in videos often require a large number of dense annotations, which are costly and labor-intensive to acquire. In this paper, we propose a semi-supervised REinforced MOtion Transformation nEtwork (REMOTE) that leverages a few labeled frames together with the temporal pose variations in videos, enabling effective learning of 2D pose estimation from sparsely annotated videos. Specifically, we introduce a Motion Transformer (MT) module that performs cross-frame reconstruction in order to learn the motion dynamics in videos. In addition, we design a novel reinforcement learning-based Frame Selection Agent (FSA) within our framework, which selects informative frame pairs on the fly to improve the pose estimator under our cross-reconstruction mechanism. We conduct extensive experiments that show the efficacy of our proposed REMOTE framework.

Published

2022-06-28

How to Cite

Ma, X., Rahmani, H., Fan, Z., Yang, B., Chen, J., & Liu, J. (2022). REMOTE: Reinforced Motion Transformation Network for Semi-supervised 2D Pose Estimation in Videos. Proceedings of the AAAI Conference on Artificial Intelligence, 36(2), 1944-1952. https://doi.org/10.1609/aaai.v36i2.20089

Section

AAAI Technical Track on Computer Vision II