3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving

Boyi Sun; Yuhang Liu; Xingxia Wang; Bin Tian; Long Chen; Fei-Yue Wang

doi:10.1609/aaai.v39i7.32760

Authors

Boyi Sun: Institute of Automation, Chinese Academy of Sciences; Zhongke JingYu Sensing Technology Co., Ltd
Yuhang Liu: Institute of Automation, Chinese Academy of Sciences; Zhongke JingYu Sensing Technology Co., Ltd
Xingxia Wang: Institute of Automation, Chinese Academy of Sciences
Bin Tian: Institute of Automation, Chinese Academy of Sciences; Waytous
Long Chen: Institute of Automation, Chinese Academy of Sciences; Waytous
Fei-Yue Wang: Institute of Automation, Chinese Academy of Sciences

Abstract

Labeling point cloud data is a time-consuming and expensive task in autonomous driving, whereas annotation-free learning avoids it by learning point cloud representations from unannotated data. In this paper, we propose AFOV, a novel 3D Annotation-Free framework assisted by 2D Open-Vocabulary segmentation models. It consists of two stages. In the first stage, we integrate high-quality textual and image features of 2D open-vocabulary models and propose Tri-Modal contrastive Pre-training (TMP). In the second stage, the spatial mapping between point clouds and images is used to generate pseudo-labels, enabling cross-modal knowledge distillation. In addition, we introduce Approximate Flat Interaction (AFI) to address noise during alignment and label confusion. To validate AFOV, we conduct extensive experiments on multiple related datasets. AFOV achieves a record 47.73% mIoU on the annotation-free 3D segmentation task on nuScenes, surpassing the previous best model by 3.13% mIoU. Fine-tuning with 1% of the data on nuScenes and SemanticKITTI reaches 51.75% mIoU and 48.14% mIoU, respectively, outperforming all previous pre-trained models.
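
The second stage relies on a spatial mapping between point clouds and images to generate pseudo-labels. The abstract does not spell out how that mapping is computed, so the following is only a minimal sketch of one common approach: a pinhole-camera projection that copies each pixel's 2D class id onto the LiDAR point that lands on it. The function name `project_pseudo_labels`, the `IGNORE` value, and the argument names are hypothetical and are not taken from the paper.

```python
import numpy as np

IGNORE = -1  # hypothetical marker for points that receive no pseudo-label


def project_pseudo_labels(points, seg_mask, K, T_cam_from_lidar):
    """Copy 2D class ids onto 3D points via a pinhole projection.

    points:           (N, 3) point coordinates in the LiDAR frame
    seg_mask:         (H, W) integer class ids from a 2D segmentation model
    K:                (3, 3) camera intrinsic matrix
    T_cam_from_lidar: (4, 4) rigid transform from LiDAR to camera frame
    Returns an (N,) array of class ids, IGNORE where no pixel is hit.
    """
    n = points.shape[0]
    labels = np.full(n, IGNORE, dtype=np.int64)

    # Move the points into the camera frame using homogeneous coordinates.
    homo = np.hstack([points, np.ones((n, 1))])
    cam = (T_cam_from_lidar @ homo.T).T[:, :3]

    # Points behind the camera cannot be seen, so they stay unlabeled.
    in_front = cam[:, 2] > 1e-6

    # Perspective projection to pixel coordinates (u = column, v = row).
    uvw = (K @ cam[in_front].T).T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(np.int64)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(np.int64)

    # Keep only projections that fall inside the image bounds.
    h, w = seg_mask.shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Write the pixel's class id back to the matching point index.
    idx = np.flatnonzero(in_front)[inside]
    labels[idx] = seg_mask[v[inside], u[inside]]
    return labels
```

Points that project outside the image or lie behind the camera keep the `IGNORE` value, so a downstream distillation loss could skip them. This sketch does not model occlusion, calibration error, or the label noise that the abstract says AFI is meant to address.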
