A Global Occlusion-Aware Approach to Self-Supervised Monocular Visual Odometry

Authors

  • Yao Lu, Renmin University of China, Beijing Key Laboratory of Big Data Management and Analysis Methods
  • Xiaoli Xu, Renmin University of China, Beijing Key Laboratory of Big Data Management and Analysis Methods
  • Mingyu Ding, The University of Hong Kong
  • Zhiwu Lu, Renmin University of China, Beijing Key Laboratory of Big Data Management and Analysis Methods
  • Tao Xiang University of Surrey

Keywords

Vision for Robotics & Autonomous Driving

Abstract

Self-supervised monocular visual odometry (VO) is often cast as a view synthesis problem based on depth and camera pose estimation. One of the key challenges is to estimate depth accurately and robustly in the presence of occlusions and moving objects in the scene. Existing methods simply detect and mask out occluded regions locally using several convolutional layers, and then perform only partial view synthesis on the rest of the image. However, occlusion and moving-object detection is an unsolved problem in itself, one that requires global layout information; inaccurate detection inevitably leads to incorrect depth and pose estimation. In this work, instead of locally detecting and masking out occlusions and moving objects, we propose to alleviate their negative effects on monocular VO implicitly but more effectively from two global perspectives. First, a multi-scale non-local attention module, consisting of both intra-stage augmented attention and cascaded across-stage attention, is proposed for robust depth estimation under occlusion, alleviating the impact of occlusions via global attention modeling. Second, adversarial learning is introduced into view synthesis for monocular VO. Unlike existing methods that apply pixel-level losses to the quality of synthesized views, we enforce that the synthesized view be indistinguishable from the real one at the scene level. Such a global constraint again helps cope with occluded and moving regions. Extensive experiments on the KITTI dataset show that our approach achieves a new state of the art in both pose estimation and depth recovery.
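The scene-level adversarial constraint described in the abstract can be sketched as a standard GAN objective in which a discriminator scores whole views rather than individual pixels. The following is a minimal illustrative sketch, not the authors' actual model: the discriminator architecture, layer sizes, and the helper `adversarial_view_synthesis_losses` are all assumptions made for demonstration.

```python
import torch
import torch.nn as nn

class SceneDiscriminator(nn.Module):
    """Tiny convolutional discriminator standing in for the paper's
    scene-level discriminator (architecture is a placeholder)."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),  # real/fake score map
        )

    def forward(self, x):
        return self.net(x)

def adversarial_view_synthesis_losses(disc, real_view, synth_view):
    """Scene-level GAN losses: the discriminator learns to separate real
    views from synthesized ones, while the synthesis path is rewarded for
    producing views the discriminator cannot tell apart from real frames."""
    bce = nn.BCEWithLogitsLoss()
    # Discriminator loss: real views labeled 1, synthesized views labeled 0.
    real_logits = disc(real_view)
    fake_logits = disc(synth_view.detach())
    d_loss = (bce(real_logits, torch.ones_like(real_logits))
              + bce(fake_logits, torch.zeros_like(fake_logits)))
    # Generator (synthesis) loss: fool the discriminator at the scene level.
    gen_logits = disc(synth_view)
    g_loss = bce(gen_logits, torch.ones_like(gen_logits))
    return d_loss, g_loss

# Toy example with random tensors in place of real/synthesized KITTI views.
real = torch.rand(2, 3, 64, 64)
synth = torch.rand(2, 3, 64, 64, requires_grad=True)
disc = SceneDiscriminator()
d_loss, g_loss = adversarial_view_synthesis_losses(disc, real, synth)
```

In the paper's pipeline this generator loss would be added to the usual photometric view-synthesis loss, so that occluded and moving regions, which pixel-level losses penalize unfairly, are judged only by whether the whole synthesized scene looks plausible.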

Published

2021-05-18

How to Cite

Lu, Y., Xu, X., Ding, M., Lu, Z., & Xiang, T. (2021). A Global Occlusion-Aware Approach to Self-Supervised Monocular Visual Odometry. Proceedings of the AAAI Conference on Artificial Intelligence, 35(3), 2260-2268. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/16325

Section

AAAI Technical Track on Computer Vision II