Geometry-Guided Domain Generalization for Monocular 3D Object Detection

Fan Yang; Hui Chen; Yuwei He; Sicheng Zhao; Chenghao Zhang; Kai Ni; Guiguang Ding

doi:10.1609/aaai.v38i6.28467

Authors

Fan Yang (Tsinghua University; BNRist; Hangzhou Zhuoxi Institute of Brain and Intelligence)
Hui Chen (Tsinghua University; BNRist)
Yuwei He (Tsinghua University; BNRist)
Sicheng Zhao (Tsinghua University; BNRist)
Chenghao Zhang (Tsinghua University; BNRist)
Kai Ni (HoloMatic Technology)
Guiguang Ding (Tsinghua University; BNRist)

DOI:

https://doi.org/10.1609/aaai.v38i6.28467

Keywords:

CV: 3D Computer Vision, ML: Transfer, Domain Adaptation, Multi-Task Learning

Abstract

Monocular 3D object detection (M3OD) is important for autonomous driving. However, existing deep learning-based methods often degrade in real-world scenarios because of the substantial domain gap between training and testing data. The domain gaps in M3OD are complex and include camera intrinsic parameters, camera extrinsic parameters, and image appearance, among other factors. Existing works focus primarily on the gap in camera intrinsic parameters and ignore the other key factors. Moreover, at the feature level, conventional domain-invariant learning methods generally cause negative transfer because they ignore the dependency between geometry tasks and domains. To tackle these issues, we propose MonoGDG, a geometry-guided domain generalization framework for M3OD that addresses the domain gap at both the camera level and the feature level. MonoGDG consists of two major components. The first is geometry-based image reprojection, which mitigates the impact of camera discrepancy by unifying intrinsic parameters, randomizing camera orientations, and unifying the field-of-view range. The second is geometry-dependent feature disentanglement, which overcomes the negative transfer problem by incorporating both domain-shared and domain-specific features. Additionally, we leverage a depth-disentangled domain discriminator and a domain-aware geometry regression attention mechanism to account for the geometry-domain dependency. Extensive experiments on multiple autonomous driving benchmarks demonstrate that our method achieves state-of-the-art performance in domain generalization for M3OD.
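
To make the camera-level idea concrete, the following is a minimal sketch of how an image can be reprojected to a virtual camera with different intrinsics and a perturbed orientation. It is not the MonoGDG implementation; it only illustrates the standard pinhole-camera result that a change of intrinsics combined with a pure rotation maps pixels through the homography H = K_dst R K_src^{-1}. The function names, the example intrinsic matrices, and the pitch range are all hypothetical choices made for illustration.

```python
import numpy as np

def reprojection_homography(K_src, K_dst, R):
    """Homography taking source pixels to a virtual camera.

    K_src, K_dst: 3x3 pinhole intrinsic matrices (source and virtual camera).
    R: 3x3 rotation from the source camera frame to the virtual camera frame.
    Valid for a pure rotation about the camera center (no translation).
    """
    return K_dst @ R @ np.linalg.inv(K_src)

def rotation_about_x(pitch_rad):
    """Rotation matrix about the camera x-axis (a pitch change)."""
    c, s = np.cos(pitch_rad), np.sin(pitch_rad)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

def warp_points(H, pts):
    """Apply homography H to an (N, 2) array of pixel coordinates."""
    pts = np.asarray(pts, dtype=float)
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]                   # back to pixels

# Hypothetical intrinsics: a source camera and a shared "unified" camera.
K_src = np.array([[1000.0, 0.0, 640.0],
                  [0.0, 1000.0, 360.0],
                  [0.0, 0.0, 1.0]])
K_dst = np.array([[500.0, 0.0, 320.0],
                  [0.0, 500.0, 180.0],
                  [0.0, 0.0, 1.0]])

# Randomized orientation: the pitch range here is an assumed value.
rng = np.random.default_rng(0)
pitch = rng.uniform(-0.05, 0.05)
H = reprojection_homography(K_src, K_dst, rotation_about_x(pitch))

corners = np.array([[0.0, 0.0], [1280.0, 0.0], [1280.0, 720.0], [0.0, 720.0]])
warped_corners = warp_points(H, corners)
```

In practice the same 3x3 matrix would be passed to an image-warping routine so that every training image appears to come from the same virtual camera, and the 3D labels would be rotated by the same R to stay consistent with the warped image.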
