OAMaskFlow: Occlusion-Aware Motion Mask for Scene Flow

Authors

  • Xiongfeng Peng, Samsung R&D Institute China-Beijing, China
  • Zhihua Liu, Samsung R&D Institute China-Beijing, China
  • Weiming Li, Samsung R&D Institute China-Beijing, China
  • Yamin Mao, Samsung R&D Institute China-Beijing, China
  • Qiang Wang, Samsung R&D Institute China-Beijing, China

DOI:

https://doi.org/10.1609/aaai.v39i6.32696

Abstract

Scene flow estimation methods have made significant progress by estimating pixel-wise 3D motion while implicitly learning a motion embedding within an end-to-end differentiable optimization framework. However, an implicitly learned motion embedding is insufficient for grouping pixels into rigid objects in challenging regions, such as occluded regions and regions with inconsistent multi-view geometric properties. To address this issue, we propose a novel scene flow estimation method called OAMaskFlow, which has three novelties. First, we introduce the concept of an occlusion-aware motion (OAM) mask and generate its ground-truth annotation through photometric and geometric consistency. Second, we supervise the motion embedding with the OAM mask to learn an informative and reliable motion representation of the scene. Finally, we propose a 3D motion propagation module that propagates high-quality 3D motion from reliable pixels to challenging occluded regions. Experiments show that our proposed OAMaskFlow reduces the EPE3D metric by 21.0% on the FlyingThings3D dataset and the SF-all metric by 24.3% on the KITTI scene flow benchmark compared with the baseline method RAFT-3D. Furthermore, we apply our proposed OAM mask in simultaneous localization and mapping (SLAM) to improve the state-of-the-art method DROID-SLAM: the ATE metric decreases by 65.7% and 58.3% on the TartanAir monocular and stereo datasets, respectively.
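The abstract's first novelty, generating OAM mask annotations from photometric and geometric consistency, follows the general pattern of consistency-based occlusion detection. A minimal sketch of that pattern is below; the function name, thresholds, and the nearest-neighbor warping are illustrative assumptions, not the paper's actual annotation pipeline.

```python
import numpy as np

def oam_mask(img1, img2, flow_fw, flow_bw, photo_thresh=0.1, geo_thresh=1.0):
    """Mark pixels as reliable when they pass both a photometric check
    (warped intensity matches) and a geometric check (forward and backward
    flows cancel). Failing pixels are treated as occluded/unreliable.
    All names and thresholds here are illustrative, not from the paper."""
    H, W = img1.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W]
    # Target coordinates in frame 2 under the forward flow (nearest-neighbor).
    x2 = np.clip(np.round(xs + flow_fw[..., 0]).astype(int), 0, W - 1)
    y2 = np.clip(np.round(ys + flow_fw[..., 1]).astype(int), 0, H - 1)
    # Photometric consistency: warp img2 back to frame 1 and compare.
    warped = img2[y2, x2]
    photo_ok = np.abs(img1 - warped) < photo_thresh
    # Geometric consistency: forward flow plus warped backward flow ~ 0.
    fb = flow_fw + flow_bw[y2, x2]
    geo_ok = np.linalg.norm(fb, axis=-1) < geo_thresh
    return photo_ok & geo_ok

# Toy example: a constant 1-pixel rightward shift with consistent flows.
H, W = 8, 8
img1 = np.random.rand(H, W)
img2 = np.roll(img1, 1, axis=1)              # frame 2 is frame 1 shifted right
flow_fw = np.zeros((H, W, 2)); flow_fw[..., 0] = 1.0
flow_bw = np.zeros((H, W, 2)); flow_bw[..., 0] = -1.0
mask = oam_mask(img1, img2, flow_fw, flow_bw)
```

In this toy case every pixel with a valid correspondence passes both checks; only the image border, where the warp is clipped, may fail, mimicking how out-of-view pixels are flagged.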

Published

2025-04-11

How to Cite

Peng, X., Liu, Z., Li, W., Mao, Y., & Wang, Q. (2025). OAMaskFlow: Occlusion-Aware Motion Mask for Scene Flow. Proceedings of the AAAI Conference on Artificial Intelligence, 39(6), 6497–6505. https://doi.org/10.1609/aaai.v39i6.32696

Section

AAAI Technical Track on Computer Vision V