FD3D: Exploiting Foreground Depth Map for Feature-Supervised Monocular 3D Object Detection

Zizhang Wu; Yuanzhu Gan; Yunzhe Wu; Ruihao Wang; Xiaoquan Wang; Jian Pu

doi:10.1609/aaai.v38i6.28436

Authors

Zizhang Wu Fudan University
Yuanzhu Gan ZongmuTech
Yunzhe Wu ZongmuTech
Ruihao Wang ZongmuTech
Xiaoquan Wang ExploAI
Jian Pu Fudan University

DOI:

https://doi.org/10.1609/aaai.v38i6.28436

Keywords:

CV: 3D Computer Vision, CV: Object Detection & Categorization

Abstract

Monocular 3D object detection usually adopts direct or hierarchical label supervision. Recently, distillation supervision has transferred spatial knowledge from LiDAR- or stereo-based teacher networks to monocular detectors, but a domain gap remains. To mitigate this issue and make better use of the available labels, we exploit the foreground depth map for feature-supervised monocular 3D object detection in a method named FD3D, which develops high-quality instructive intermediate features to provide auxiliary feature supervision, taking only the original image and the annotation foreground object-wise depth map (AFOD) as input. We build an instructive feature generation network that creates instructive spatial features from the correlation between image features and the pre-processed AFOD, where the AFOD focuses attention only on foreground objects to give clearer guidance for the detection task. Moreover, we apply the auxiliary feature supervision at both the pixel level and the distribution level to achieve comprehensive spatial knowledge guidance. Extensive experiments demonstrate that our method achieves state-of-the-art performance on both the KITTI and nuScenes datasets, with no external data and no extra inference computational cost. We also conduct quantitative and qualitative studies to reveal the effectiveness of our designs.
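
To make the two ideas in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a foreground object-wise depth map can be approximated by filling each annotated 2D box with that object's depth (nearer objects winning where boxes overlap), that pixel-level supervision is a mean squared error between features, and that distribution-level supervision is a KL divergence between softmax-normalised features. The abstract does not specify any of these choices, and all function names here are hypothetical.

```python
import math


def foreground_depth_map(height, width, objects, background=0.0):
    """Build a depth map that is non-zero only inside annotated object boxes.

    objects: list of (x1, y1, x2, y2, depth) with half-open pixel boxes.
    Assumption: each box is filled with a single per-object depth value.
    """
    depth_map = [[background] * width for _ in range(height)]
    # Paint far objects first so nearer ones overwrite them where boxes overlap.
    for x1, y1, x2, y2, depth in sorted(objects, key=lambda o: -o[4]):
        for y in range(max(0, y1), min(height, y2)):
            for x in range(max(0, x1), min(width, x2)):
                depth_map[y][x] = depth
    return depth_map


def pixel_loss(student, teacher):
    """Assumed pixel-level term: mean squared error over flat feature lists."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def softmax(values):
    """Numerically stable softmax over a flat list."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def distribution_loss(student, teacher):
    """Assumed distribution-level term: KL(teacher || student) after softmax."""
    p = softmax(teacher)
    q = softmax(student)
    # Skip terms where the target probability underflowed to zero.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0 and qi > 0)
```

In this sketch the "teacher" features stand in for the instructive features generated from the image and the AFOD, and the "student" features stand in for the detector's intermediate features. Because such supervision applies only during training, it is consistent with the abstract's claim of no extra inference cost.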
