Multi-Resolution Monocular Depth Map Fusion by Self-Supervised Gradient-Based Composition

Authors

  • Yaqiao Dai, National University of Defense Technology
  • Renjiao Yi, National University of Defense Technology
  • Chenyang Zhu, National University of Defense Technology
  • Hongjun He, National University of Defense Technology
  • Kai Xu, National University of Defense Technology

DOI:

https://doi.org/10.1609/aaai.v37i1.25123

Keywords:

CV: 3D Computer Vision, CV: Scene Analysis & Understanding

Abstract

Monocular depth estimation is a challenging problem on which deep neural networks have demonstrated great potential. However, depth maps predicted by existing deep models usually lack fine-grained details because of the convolution and down-sampling operations in the networks. We find that increasing the input resolution helps preserve more local details, while the estimation at low resolution is more accurate globally. Therefore, we propose a novel depth map fusion module that combines the advantages of estimations from multi-resolution inputs. Instead of merging the low- and high-resolution estimations equally, we adopt the core idea of Poisson fusion and implant the gradient domain of the high-resolution depth into the low-resolution depth. While classic Poisson fusion requires a fusion mask as supervision, we propose a self-supervised framework based on guided image filtering. We demonstrate that this gradient-based composition has much better noise immunity than the state-of-the-art depth map fusion method. Our lightweight depth fusion is one-shot and runs in real time, making it 80 times faster than a state-of-the-art depth fusion method. Quantitative evaluations demonstrate that the proposed method can be integrated into many fully convolutional monocular depth estimation backbones with a significant performance boost, leading to state-of-the-art results for detail enhancement on depth maps. Code is released at https://github.com/yuinsky/gradient-based-depth-map-fusion.
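
The gradient-domain idea in the abstract can be illustrated with a classic closed-form screened Poisson blend. This is only a minimal sketch and not the authors' learned, self-supervised fusion module. It assumes both depth maps have already been resampled to the same resolution, uses periodic boundary conditions so the problem can be solved with an FFT, and the function name `screened_poisson_fuse` and the weight `lam` are hypothetical names introduced here for illustration. The fused map F minimizes ||grad F - grad high||^2 + lam * ||F - low||^2, so it takes its local gradients from the high-resolution estimate and its global level from the low-resolution one.

```python
import numpy as np


def screened_poisson_fuse(low, high, lam=0.05):
    """Illustrative gradient-domain blend (not the paper's network).

    Returns F minimizing ||grad F - grad high||^2 + lam * ||F - low||^2
    under periodic boundary conditions.

    low  : 2D array, globally accurate depth (hypothetical input)
    high : 2D array, detail-rich depth at the same shape (hypothetical input)
    lam  : positive weight pulling F toward `low`
    """
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)
    if low.shape != high.shape or low.ndim != 2:
        raise ValueError("low and high must be 2D arrays of the same shape")
    if lam <= 0:
        raise ValueError("lam must be positive")

    h, w = low.shape
    # Angular frequencies of the discrete Fourier basis along each axis.
    wy = 2.0 * np.pi * np.fft.fftfreq(h)
    wx = 2.0 * np.pi * np.fft.fftfreq(w)
    # Eigenvalues of the negative periodic discrete Laplacian (all >= 0).
    lap = (2.0 - 2.0 * np.cos(wy))[:, None] + (2.0 - 2.0 * np.cos(wx))[None, :]

    # Normal equations (-Laplacian + lam) F = -Laplacian(high) + lam * low,
    # which become a per-frequency division in the Fourier domain.
    fused_hat = (lap * np.fft.fft2(high) + lam * np.fft.fft2(low)) / (lap + lam)
    return np.real(np.fft.ifft2(fused_hat))
```

In this sketch a small `lam` keeps nearly all of the high-resolution gradients, while a large `lam` returns something close to the low-resolution map. The mean of the result always equals the mean of `low`, which mirrors the observation that the low-resolution estimate is the more reliable one globally.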

Published

2023-06-26

How to Cite

Dai, Y., Yi, R., Zhu, C., He, H., & Xu, K. (2023). Multi-Resolution Monocular Depth Map Fusion by Self-Supervised Gradient-Based Composition. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 488-496. https://doi.org/10.1609/aaai.v37i1.25123

Section

AAAI Technical Track on Computer Vision I