CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection

Authors

  • Gyusam Chang Department of Artificial Intelligence, Korea University
  • Wonseok Roh Department of Artificial Intelligence, Korea University
  • Sujin Jang Samsung Advanced Institute of Technology (SAIT)
  • Dongwook Lee Samsung Advanced Institute of Technology (SAIT)
  • Daehyun Ji Samsung Advanced Institute of Technology (SAIT)
  • Gyeongrok Oh Department of Artificial Intelligence, Korea University
  • Jinsun Park School of Computer Science and Engineering, Pusan National University
  • Jinkyu Kim Department of Computer Science and Engineering, Korea University
  • Sangpil Kim Department of Artificial Intelligence, Korea University

DOI:

https://doi.org/10.1609/aaai.v38i2.27857

Keywords:

CV: 3D Computer Vision, CV: Vision for Robotics & Autonomous Driving, CV: Object Detection & Categorization, CV: Multi-modal Vision, ML: Adversarial Learning & Robustness, ML: Transfer, Domain Adaptation, Multi-Task Learning

Abstract

Recent LiDAR-based 3D Object Detection (3DOD) methods show promising results, but they often do not generalize well to target domains outside the source (or training) data distribution. To reduce such domain gaps and thus make 3DOD models more generalizable, we introduce a novel unsupervised domain adaptation (UDA) method, called CMDA, which (i) leverages visual semantic cues from an image modality (i.e., camera images) as an effective semantic bridge to close the domain gap in the cross-modal Bird's Eye View (BEV) representations. Further, (ii) we introduce a self-training-based learning strategy in which the model is adversarially trained to generate domain-invariant features, disrupting the discrimination of whether a feature instance comes from the source or an unseen target domain. Overall, our CMDA framework guides the 3DOD model to generate highly informative and domain-adaptive features for novel data distributions. In extensive experiments on large-scale benchmarks such as nuScenes, Waymo, and KITTI, the proposed components provide significant performance gains for UDA tasks, achieving state-of-the-art performance.
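The adversarial component described above (training features so that a discriminator cannot tell source from target) is commonly realized with a gradient reversal layer. The sketch below is not the authors' implementation; it is a minimal, generic PyTorch illustration of the idea, with the module names (`GradReverse`, `DomainDiscriminator`), the feature dimension, and the reversal strength `lam` all being illustrative assumptions.

```python
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity in the forward pass, negated (scaled)
    gradient in the backward pass, so minimizing the discriminator loss
    pushes the feature extractor toward domain-invariant features."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse and scale the gradient flowing back into the features.
        return -ctx.lam * grad_output, None


class DomainDiscriminator(nn.Module):
    """Binary classifier predicting source vs. target domain from
    (hypothetical) BEV feature vectors, behind a gradient reversal layer."""

    def __init__(self, feat_dim: int = 256, lam: float = 1.0):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 1),  # logit: source (1) vs. target (0)
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.net(GradReverse.apply(feat, self.lam))
```

In training, the discriminator logits would be fed to a binary cross-entropy loss against domain labels; the reversed gradient makes the same loss simultaneously train the discriminator to separate domains and the feature extractor to confuse it.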

Published

2024-03-24

How to Cite

Chang, G., Roh, W., Jang, S., Lee, D., Ji, D., Oh, G., Park, J., Kim, J., & Kim, S. (2024). CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 972-980. https://doi.org/10.1609/aaai.v38i2.27857

Section

AAAI Technical Track on Computer Vision I