Dual Decoupling Training for Semi-supervised Object Detection with Noise-Bypass Head

Shida Zheng; Chenshu Chen; Xiaowei Cai; Tingqun Ye; Wenming Tan

doi:10.1609/aaai.v36i3.20264

Authors

Shida Zheng Hikvision Research Institute
Chenshu Chen Hikvision Research Institute
Xiaowei Cai Hikvision Research Institute
Tingqun Ye Hikvision Research Institute
Wenming Tan Hikvision Research Institute

DOI:

https://doi.org/10.1609/aaai.v36i3.20264

Keywords:

Computer Vision (CV)

Abstract

Pseudo bounding boxes from the self-training paradigm are inevitably noisy in semi-supervised object detection. To cope with this, we propose a dual decoupling training framework: clean and noisy data decoupling, and classification and localization task decoupling. In the first decoupling, two-level thresholds are used to categorize pseudo boxes into three groups, i.e., clean backgrounds, noisy foregrounds and clean foregrounds. With a specially designed noise-bypass head focusing on noisy data, the backbone network can extract coarse but diverse information, while the original head learns from clean samples for more precise predictions. In the second decoupling, we take advantage of the two-head structure to better evaluate localization quality, so that the category label and the location of a pseudo box can remain independent of each other during training. The two-level threshold approach is also applied to group pseudo boxes into three sections of different localization accuracy. We outperform existing works by a large margin on the VOC datasets, reaching 54.8 mAP (+1.8), and up to 55.9 mAP (+1.5) when leveraging MS-COCO train2017 as extra unlabeled data. On the MS-COCO benchmark, our method also achieves an improvement of about 1.0 mAP, averaged across protocols, over the prior state of the art.
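
The two-level thresholding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_by_two_thresholds`, the threshold values 0.3 and 0.9, and the boundary conventions (strictly below the lower threshold is background, at or above the upper threshold is clean foreground) are all assumptions made for illustration. The abstract does not state the actual thresholds or how the scores are computed.

```python
from typing import Dict, List, Sequence


def split_by_two_thresholds(
    scores: Sequence[float], low: float, high: float
) -> Dict[str, List[int]]:
    """Partition pseudo-box indices into three groups by confidence score.

    Hypothetical helper: names, thresholds and boundary handling are
    assumptions for illustration, not values taken from the paper.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    groups: Dict[str, List[int]] = {
        "clean_background": [],
        "noisy_foreground": [],
        "clean_foreground": [],
    }
    for i, s in enumerate(scores):
        if s < low:
            # Below the lower threshold: treated as clean background.
            groups["clean_background"].append(i)
        elif s < high:
            # Between the thresholds: uncertain, treated as noisy foreground.
            groups["noisy_foreground"].append(i)
        else:
            # At or above the upper threshold: treated as clean foreground.
            groups["clean_foreground"].append(i)
    return groups


# Example with assumed thresholds low=0.3, high=0.9.
groups = split_by_two_thresholds([0.05, 0.40, 0.95, 0.70], low=0.3, high=0.9)
```

Following the abstract, boxes in the noisy-foreground group would be handled by the noise-bypass head, while the clean groups would supervise the original head. The same partitioning idea would apply to localization quality in the second decoupling, with a localization score in place of the classification confidence.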
