Category Dictionary Guided Unsupervised Domain Adaptation for Object Detection

Authors

  • Shuai Li, The Hong Kong Polytechnic University, Hong Kong, China
  • Jianqiang Huang, Damo Academy, Alibaba Group
  • Xian-Sheng Hua, Damo Academy, Alibaba Group
  • Lei Zhang, The Hong Kong Polytechnic University, Hong Kong, China

DOI:

https://doi.org/10.1609/aaai.v35i3.16290

Keywords:

Object Detection & Categorization

Abstract

Unsupervised domain adaptation (UDA) is a promising way to improve a model's generalization from a source domain to a target domain without manually annotating labels for the target data. Recent works in cross-domain object detection mostly resort to adversarial feature adaptation to match the marginal distributions of the two domains. However, perfect feature alignment is hard to achieve and is likely to cause negative transfer due to the high complexity of object detection. In this paper, we propose a category dictionary guided (CDG) UDA model for cross-domain object detection, which learns category-specific dictionaries from the source domain to represent the candidate boxes in the target domain. The representation residual can be used not only for pseudo-label assignment but also for quality (e.g., IoU) estimation of the candidate box. A residual-weighted self-training paradigm is then developed to implicitly align the source and target domains for detection model training. Compared with decision-boundary-based classifiers such as softmax, the proposed CDG scheme can select more informative and reliable pseudo-boxes. Experimental results on benchmark datasets show that the proposed CDG significantly outperforms the state of the art in cross-domain object detection.
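
The residual-based pseudo-labelling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the dictionaries are learned or how the residual is turned into a training weight, so the ridge-regularized least-squares coding, the exponential weighting, and all names below (`representation_residuals`, `assign_pseudo_label`, `ridge`, `temperature`) are assumptions made only for illustration.

```python
import numpy as np


def representation_residuals(feature, dictionaries, ridge=1e-3):
    """Reconstruction residual of `feature` under each category dictionary.

    Assumption: each dictionary is a (feature_dim, num_atoms) array and the
    code is found by ridge-regularized least squares. The paper may use a
    different coding scheme.
    """
    residuals = []
    for D in dictionaries:
        # Solve argmin_a ||x - D a||^2 + ridge * ||a||^2 in closed form.
        gram = D.T @ D + ridge * np.eye(D.shape[1])
        coeffs = np.linalg.solve(gram, D.T @ feature)
        residuals.append(np.linalg.norm(feature - D @ coeffs))
    return np.array(residuals)


def assign_pseudo_label(feature, dictionaries, temperature=1.0):
    """Pick the category with the smallest residual and derive a weight.

    Assumption: the weight exp(-residual / temperature) is a hypothetical
    stand-in for the residual weighting mentioned in the abstract.
    """
    residuals = representation_residuals(feature, dictionaries)
    label = int(np.argmin(residuals))
    weight = float(np.exp(-residuals[label] / temperature))
    return label, weight


# Toy example: two categories spanning orthogonal 2-D subspaces of R^4.
toy_dictionaries = [np.eye(4)[:, :2], np.eye(4)[:, 2:]]
toy_feature = np.array([1.0, 0.5, 0.05, 0.0])  # lies almost entirely in category 0
toy_label, toy_weight = assign_pseudo_label(toy_feature, toy_dictionaries)
```

In this toy setting the candidate feature is reconstructed almost perfectly by the first dictionary, so it receives that category as its pseudo label together with a weight close to 1. A feature that no dictionary represents well would receive a small weight and contribute little to self-training.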

Published

2021-05-18

How to Cite

Li, S., Huang, J., Hua, X.-S., & Zhang, L. (2021). Category Dictionary Guided Unsupervised Domain Adaptation for Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 35(3), 1949-1957. https://doi.org/10.1609/aaai.v35i3.16290

Section

AAAI Technical Track on Computer Vision II