Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN


  • Hang Xu, Huawei Noah's Ark Lab
  • Linpu Fang, South China University of Technology
  • Xiaodan Liang, Sun Yat-sen University
  • Wenxiong Kang, South China University of Technology
  • Zhenguo Li, Huawei Noah's Ark Lab




Dominant object detection approaches treat each dataset separately and fit a specific domain, so they cannot adapt to other domains without extensive retraining. In this paper, we address the problem of designing a universal object detection model that exploits diverse category granularity from multiple domains and predicts all kinds of categories in one system. Existing works approach this problem by integrating multiple detection branches on top of one shared backbone network. However, this paradigm overlooks the crucial semantic correlations between multiple domains, such as category hierarchy, visual similarity, and linguistic relationships. To address these drawbacks, we present a novel universal object detector called Universal-RCNN that incorporates graph transfer learning to propagate relevant semantic information across multiple datasets and reach semantic coherency. Specifically, we first generate a global semantic pool by integrating the high-level semantic representations of all categories. Then an Intra-Domain Reasoning Module learns and propagates a sparse graph representation within one dataset, guided by a spatial-aware GCN. Finally, an Inter-Domain Transfer Module exploits diverse transfer dependencies across all domains and enhances the regional feature representation by attending to and transferring semantic contexts globally. Extensive experiments demonstrate that the proposed method significantly outperforms multiple-branch models and achieves state-of-the-art results on multiple object detection benchmarks (mAP: 49.1% on COCO).
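
As a rough illustration of two ideas named in the abstract (pooling category representations from several domains, then enhancing a region feature by attending over that pool), the following Python sketch may help. It is not the authors' implementation: the function names, the toy embeddings, and the use of plain dot-product attention are all assumptions made for illustration, and the spatial-aware GCN and the Intra-Domain Reasoning Module are omitted entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_semantic_pool(domain_embeddings):
    """Flatten per-domain category embeddings into one global pool.

    domain_embeddings: {domain_name: {category_name: vector}} (hypothetical layout).
    Returns a list of (domain, category, vector) tuples.
    """
    pool = []
    for domain, categories in domain_embeddings.items():
        for name, vector in categories.items():
            pool.append((domain, name, vector))
    return pool


def enhance_region(region_feature, pool):
    """Attend over the pool and append the attended context to the region feature.

    Assumption: dot-product attention stands in for the paper's transfer mechanism.
    """
    weights = softmax([dot(region_feature, vec) for _, _, vec in pool])
    dim = len(region_feature)
    context = [
        sum(w * vec[i] for w, (_, _, vec) in zip(weights, pool))
        for i in range(dim)
    ]
    return region_feature + context, weights


# Toy example: two hypothetical domains with 2-D category embeddings.
domains = {
    "coco": {"dog": [1.0, 0.0], "car": [0.0, 1.0]},
    "vg": {"puppy": [0.9, 0.1]},
}
pool = build_semantic_pool(domains)
enhanced, weights = enhance_region([2.0, 0.0], pool)
```

In this toy setup a region feature pointing along the first axis draws most of its context from "dog" and "puppy", even though they come from different domains, which is the kind of cross-domain sharing the abstract describes.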




How to Cite

Xu, H., Fang, L., Liang, X., Kang, W., & Li, Z. (2020). Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 12492-12499. https://doi.org/10.1609/aaai.v34i07.6937



AAAI Technical Track: Vision