Dual Decoupling Training for Semi-supervised Object Detection with Noise-Bypass Head

Authors

  • Shida Zheng Hikvision Research Institute
  • Chenshu Chen Hikvision Research Institute
  • Xiaowei Cai Hikvision Research Institute
  • Tingqun Ye Hikvision Research Institute
  • Wenming Tan Hikvision Research Institute

DOI:

https://doi.org/10.1609/aaai.v36i3.20264

Keywords:

Computer Vision (CV)

Abstract

Pseudo bounding boxes produced by the self-training paradigm are inevitably noisy in semi-supervised object detection. To cope with this, we propose a dual decoupling training framework that decouples clean data from noisy data and the classification task from the localization task. In the first decoupling, two-level thresholds categorize pseudo boxes into three groups: clean backgrounds, noisy foregrounds, and clean foregrounds. A specially designed noise-bypass head focuses on the noisy data so that the backbone network can extract coarse but diverse information, while the original head learns from clean samples for more precise predictions. In the second decoupling, we take advantage of the two-head structure to better evaluate localization quality, so that the category label and the location of a pseudo box remain independent of each other during training. The two-level threshold approach is also applied to group pseudo boxes into three sections of different localization accuracy. Our method outperforms existing works by a large margin on the VOC datasets, reaching 54.8 mAP (+1.8), and up to 55.9 mAP (+1.5) when MS-COCO train2017 is leveraged as extra unlabeled data. On the MS-COCO benchmark, our method also improves on the prior state of the art by about 1.0 mAP averaged across protocols.
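
The two-level thresholding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_pseudo_boxes`, the threshold values 0.3 and 0.7, and the assumption that boxes are grouped by a single confidence score (with the lowest band treated as clean background) are all hypothetical choices made for illustration only.

```python
from typing import Dict, List, Sequence


def split_pseudo_boxes(
    scores: Sequence[float],
    low: float = 0.3,   # hypothetical lower threshold, not from the paper
    high: float = 0.7,  # hypothetical upper threshold, not from the paper
) -> Dict[str, List[int]]:
    """Group pseudo-box indices into three bands using two thresholds.

    Assumed mapping (illustrative only):
      score >= high        -> clean foreground
      low <= score < high  -> noisy foreground
      score < low          -> clean background
    """
    if not 0.0 <= low < high <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= low < high <= 1")

    groups: Dict[str, List[int]] = {
        "clean_background": [],
        "noisy_foreground": [],
        "clean_foreground": [],
    }
    for index, score in enumerate(scores):
        if score >= high:
            groups["clean_foreground"].append(index)
        elif score >= low:
            groups["noisy_foreground"].append(index)
        else:
            groups["clean_background"].append(index)
    return groups


if __name__ == "__main__":
    # Example confidence scores for five pseudo boxes.
    print(split_pseudo_boxes([0.05, 0.5, 0.9, 0.3, 0.7]))
```

Under the framework described above, the noisy-foreground group would presumably be routed to the noise-bypass head and the clean groups to the original head. The same three-way split could in principle be reused with a localization-quality score in place of the classification confidence, which is how the abstract describes the second decoupling.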

Published

2022-06-28

How to Cite

Zheng, S., Chen, C., Cai, X., Ye, T., & Tan, W. (2022). Dual Decoupling Training for Semi-supervised Object Detection with Noise-Bypass Head. Proceedings of the AAAI Conference on Artificial Intelligence, 36(3), 3526-3534. https://doi.org/10.1609/aaai.v36i3.20264

Issue

Vol. 36 No. 3

Section

AAAI Technical Track on Computer Vision III