Less Is Better: Sparse Instance Learning for Cross-Domain Few-Shot Object Detection

Yali Huang; Jie Mei; Ziyi Wu; Yiming Yang; Hongru Zhao; Mingyuan Jiu; Hichem Sahbi

doi:10.1609/aaai.v40i7.37432

Authors

Yali Huang School of Computer and Artificial Intelligence, Zhengzhou University, China
Jie Mei Dongfeng Commercial Vehicle Co., Ltd., China
Ziyi Wu School of Computer and Artificial Intelligence, Zhengzhou University, China
Yiming Yang School of Computer and Artificial Intelligence, Zhengzhou University, China
Hongru Zhao School of Computer and Artificial Intelligence, Zhengzhou University, China; Engineering Research Center of Intelligent Swarm Systems, Ministry of Education, China; National SuperComputing Center in Zhengzhou, Zhengzhou, China
Mingyuan Jiu School of Computer and Artificial Intelligence, Zhengzhou University, China; Engineering Research Center of Intelligent Swarm Systems, Ministry of Education, China; National SuperComputing Center in Zhengzhou, Zhengzhou, China
Hichem Sahbi Sorbonne University, CNRS, LIP6, F-75005, Paris, France

DOI:

https://doi.org/10.1609/aaai.v40i7.37432

Abstract

Cross-Domain Few-Shot Object Detection (CD-FSOD) is an extremely challenging task due to inherent data scarcity and the substantial domain shift between the source and target domains. Existing methods often suffer from overfitting and noisy feature representations, which hinder the construction of discriminative class prototypes in the target domain. In this paper, we propose a novel sparse instance learning framework (SI-ViTO) for CD-FSOD, which leverages instance sparsity to achieve better detection with fewer representations. SI-ViTO adopts a dual-stage sparsity module that applies instance feature sparsity to both the few-shot support images and the query images. This dual sparsity enables the model to preserve salient foreground semantics while filtering out redundant or noisy information. Furthermore, a new prototype calibration strategy dynamically refines the class prototypes with query instances to accelerate prototype adaptation. Extensive experiments on CD-FSOD benchmarks show that SI-ViTO outperforms state-of-the-art methods, demonstrating that a small set of discriminative representations yields better cross-domain few-shot object detection performance than a more abundant one.
