Multi-View Domain Adaptive Object Detection on Camera Networks

Authors

  • Yan Lu, New York University
  • Zhun Zhong, University of Trento
  • Yuanchao Shu, Zhejiang University

DOI:

https://doi.org/10.1609/aaai.v37i7.26077

Keywords:

ML: Transfer, Domain Adaptation, Multi-Task Learning, CV: Applications, CV: Object Detection & Categorization, ML: Unsupervised & Self-Supervised Learning

Abstract

In this paper, we study a new domain adaptation setting on camera networks, namely Multi-View Domain Adaptive Object Detection (MVDA-OD), in which labeled source data is unavailable in the target adaptation process and target data is captured from multiple overlapping cameras. In such a challenging context, existing methods including adversarial training and self-training fall short due to multi-domain data shift and the lack of source data. To tackle this problem, we propose a novel training framework consisting of two stages. First, we pre-train the backbone using self-supervised learning, in which a multi-view association is developed to construct an effective pretext task. Second, we fine-tune the detection head using robust self-training, where a tracking-based single-view augmentation is introduced to achieve weak-hard consistency learning. By doing so, an object detection model can take advantage of informative samples generated by multi-view association and single-view augmentation to learn discriminative backbones as well as robust detection classifiers. Experiments on two real-world multi-camera datasets demonstrate significant advantages of our approach over the state-of-the-art domain adaptive object detection methods.
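The abstract does not spell out the loss used in the second stage, so the following is only a minimal sketch of the general weak-hard consistency idea: confident predictions on a weakly augmented view serve as pseudo-labels for the harder view of the same object. The function name `consistency_loss`, the 0.8 threshold, and the list-based inputs are illustrative assumptions, not the authors' implementation; in the paper the hard views come from tracking-based augmentation, which is not modeled here.

```python
import math


def consistency_loss(weak_probs, hard_probs, threshold=0.8):
    """Average cross-entropy of hard-view predictions against weak-view pseudo-labels.

    weak_probs, hard_probs: lists of per-object class-probability lists,
    aligned so that index i refers to the same object in both views.
    threshold: hypothetical confidence cut-off for accepting a pseudo-label.
    """
    total, kept = 0.0, 0
    for weak, hard in zip(weak_probs, hard_probs):
        confidence = max(weak)
        if confidence < threshold:
            continue  # discard unreliable pseudo-labels
        label = weak.index(confidence)  # pseudo-label taken from the weak view
        total += -math.log(max(hard[label], 1e-12))  # clamp to avoid log(0)
        kept += 1
    # Return 0.0 when no pseudo-label passes the threshold.
    return total / kept if kept else 0.0


if __name__ == "__main__":
    weak = [[0.9, 0.1], [0.6, 0.4]]  # only the first object is confident
    hard = [[0.5, 0.5], [0.2, 0.8]]
    print(consistency_loss(weak, hard))
```

Filtering by confidence before computing the loss is one common way to keep noisy pseudo-labels from dominating self-training when no labeled source data is available.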

Published

2023-06-26

How to Cite

Lu, Y., Zhong, Z., & Shu, Y. (2023). Multi-View Domain Adaptive Object Detection on Camera Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 37(7), 8966-8974. https://doi.org/10.1609/aaai.v37i7.26077

Section

AAAI Technical Track on Machine Learning II