Synthetic Depth Transfer for Monocular 3D Object Pose Estimation in the Wild

Yueying Kao; Weiming Li; Qiang Wang; Zhouchen Lin; Wooshik Kim; Sunghoon Hong

doi:10.1609/aaai.v34i07.6781

Authors

Yueying Kao, Samsung Research China - Beijing (SRC-B)
Weiming Li, Samsung Research China - Beijing (SRC-B)
Qiang Wang, Samsung Research China - Beijing (SRC-B)
Zhouchen Lin, Peking University
Wooshik Kim, Samsung Advanced Institute of Technology (SAIT)
Sunghoon Hong, Samsung Advanced Institute of Technology (SAIT)

Abstract

Monocular object pose estimation is an important yet challenging computer vision problem. Depth features can provide useful information for pose estimation. However, existing methods rely on real depth images to extract depth features, which makes them difficult to use in many applications. In this paper, we aim to extract both RGB and depth features from a single RGB image, with the help of synthetic RGB-depth image pairs, for object pose estimation. Specifically, we propose a deep convolutional neural network with an RGB-to-Depth Embedding module and a Synthetic-Real Adaptation module. The embedding module is trained on synthetic paired data to learn a depth-oriented embedding space between RGB and depth images that is optimized for object pose estimation. The adaptation module further aligns the distributions of synthetic and real data. Compared to existing methods, our method does not need any real depth images and can be trained easily with large-scale synthetic data. Extensive experiments and comparisons show that our method achieves the best performance on the challenging public PASCAL 3D+ dataset on all metrics, which substantiates the superiority of our method and the contribution of the two modules.
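
The abstract names two training signals, an RGB-to-depth embedding learned from synthetic pairs and a synthetic-to-real alignment, but does not give their loss functions. The sketch below shows one way such terms could be combined, purely as an illustration. Everything in it is an assumption: the function names (`embedding_loss`, `adaptation_loss`, `total_loss`), the mean-squared-error embedding term, the first-moment (mean-matching) alignment term, and the weights `w_embed` and `w_adapt` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def embedding_loss(rgb_feat: Vector, depth_feat: Vector) -> float:
    """Mean squared distance between a feature predicted from an RGB image
    and the feature of its paired synthetic depth image (assumed form)."""
    if len(rgb_feat) != len(depth_feat):
        raise ValueError("feature vectors must have the same length")
    return sum((r - d) ** 2 for r, d in zip(rgb_feat, depth_feat)) / len(rgb_feat)


def batch_mean(feats: List[Vector]) -> List[float]:
    """Per-dimension mean of a non-empty batch of feature vectors."""
    n = len(feats)
    dim = len(feats[0])
    return [sum(f[i] for f in feats) / n for i in range(dim)]


def adaptation_loss(synthetic_feats: List[Vector], real_feats: List[Vector]) -> float:
    """Squared distance between the batch means of synthetic and real
    features: a simple first-moment stand-in for distribution alignment."""
    mu_s = batch_mean(synthetic_feats)
    mu_r = batch_mean(real_feats)
    return sum((s - r) ** 2 for s, r in zip(mu_s, mu_r))


def total_loss(pose: float, embed: float, adapt: float,
               w_embed: float = 1.0, w_adapt: float = 1.0) -> float:
    """Weighted sum of a pose loss and the two auxiliary terms.
    The weights are hypothetical hyperparameters."""
    return pose + w_embed * embed + w_adapt * adapt


if __name__ == "__main__":
    # Toy feature vectors; in practice these would come from network branches.
    e = embedding_loss([1.0, 2.0], [1.0, 4.0])
    a = adaptation_loss([[0.0, 0.0]], [[3.0, 4.0]])
    print(e, a, total_loss(1.0, e, a, w_embed=0.5, w_adapt=0.5))
```

In this toy setup the embedding term uses only synthetic RGB-depth pairs, and the alignment term compares synthetic and real RGB features without any real depth, which matches the abstract's statement that no real depth images are needed. A real implementation would compute these terms on learned network features.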
