AdaCo: Overcoming Visual Foundation Model Noise in 3D Semantic Segmentation via Adaptive Label Correction

Authors

  • Pufan Zou Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University, China Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, China,
  • Shijia Zhao Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University, China Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, China,
  • Weijie Huang Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University, China Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, China,
  • Qiming Xia Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University, China Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, China,
  • Chenglu Wen Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University, China Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, China,
  • Wei Li Inceptio
  • Cheng Wang Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University, China Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, China,

DOI:

https://doi.org/10.1609/aaai.v39i10.33205

Abstract

Recently, Visual Foundation Models (VFMs) have shown a remarkable generalization performance in 3D perception tasks. However, their effectiveness in large-scale outdoor datasets remains constrained by the scarcity of accurate supervision signals, the extensive noise caused by variable outdoor conditions, and the abundance of unknown objects. In this work, we propose a novel label-free learning method, Adaptive Label Correction (AdaCo), for 3D semantic segmentation. AdaCo first introduces the Cross-modal Label Generation Module (CLGM), providing cross-modal supervision with the formidable interpretive capabilities of the VFMs. Subsequently, AdaCo incorporates the Adaptive Noise Corrector (ANC), updating and adjusting the noisy samples within this supervision iteratively during training. Moreover, we develop an Adaptive Robust Loss (ARL) function to modulate each sample's sensitivity to noisy supervision, preventing potential underfitting issues associated with robust loss. Our proposed AdaCo can effectively mitigate the performance limitations of label-free learning networks in 3D semantic segmentation tasks. Extensive experiments on two outdoor benchmark datasets highlight the superior performance of our method.

Downloads

Published

2025-04-11

How to Cite

Zou, P., Zhao, S., Huang, W., Xia, Q., Wen, C., Li, W., & Wang, C. (2025). AdaCo: Overcoming Visual Foundation Model Noise in 3D Semantic Segmentation via Adaptive Label Correction. Proceedings of the AAAI Conference on Artificial Intelligence, 39(10), 11086–11094. https://doi.org/10.1609/aaai.v39i10.33205

Issue

Section

AAAI Technical Track on Computer Vision IX