Spotting the Unseen: Reciprocal Consensus Network Guided by Visual Archetypes
DOI
https://doi.org/10.1609/aaai.v38i11.29149
Keywords
ML: Deep Learning Algorithms, APP: Humanities & Computational Social Science, CV: Applications, CV: Object Detection & Categorization
Abstract
Humans often need only a few visual archetypes to spot novel objects. Motivated by this observation, we present a strategy rooted in "spotting the unseen": establishing dense correspondences between potential query image regions and a visual archetype. To this end, we propose the Consensus Network (CoNet). Our method leverages relational patterns within and across images via an Auto-Correlation Representation (ACR) and a Mutual-Correlation Representation (MCR). Within each image, the ACR module encodes local self-similarity and global context simultaneously. Between the query and support images, the MCR module computes the cross-correlation between the two image representations and introduces a reciprocal consistency constraint, which excludes outliers and enhances model robustness. To overcome the challenge of low-resource training data, particularly in one-shot learning scenarios, we incorporate an adaptive margin strategy to better handle diverse instances. Experimental results across diverse domains, including object detection in natural scenes and text spotting in both historical manuscripts and natural scenes, demonstrate the effectiveness and strong generalization ability of the proposed method. Our code is available at: https://github.com/infinite-hwb/conet
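The reciprocal consistency constraint described in the abstract can be read as a mutual nearest-neighbor check between query and support region descriptors: a correspondence is kept only if each side's best match points back at the other. The following NumPy sketch illustrates that idea; it is an assumption-laden illustration, not the paper's implementation, and the function name and descriptor shapes are hypothetical.

```python
import numpy as np

def reciprocal_consensus(query_feats, support_feats):
    """Mutual nearest-neighbor filtering between two sets of region
    descriptors (rows). Returns (query_idx, support_idx) pairs that
    agree reciprocally; one-sided matches are dropped as outliers."""
    # L2-normalize so the cross-correlation is cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    s = support_feats / np.linalg.norm(support_feats, axis=1, keepdims=True)
    sim = q @ s.T  # cross-correlation matrix, shape (n_query, n_support)

    best_s_for_q = sim.argmax(axis=1)  # support region preferred by each query region
    best_q_for_s = sim.argmax(axis=0)  # query region preferred by each support region

    # Keep a pair (i, j) only when the preference is mutual.
    return [(i, j) for i, j in enumerate(best_s_for_q) if best_q_for_s[j] == i]
```

For example, a query region that is only weakly similar to a support region is rejected when that support region prefers a different query region, which is the outlier-exclusion behavior the abstract attributes to the MCR module.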
Published
2024-03-24
How to Cite
Hu, W., Zhan, H., Ma, X., Lu, Y., & Suen, C. Y. (2024). Spotting the Unseen: Reciprocal Consensus Network Guided by Visual Archetypes. Proceedings of the AAAI Conference on Artificial Intelligence, 38(11), 12556–12564. https://doi.org/10.1609/aaai.v38i11.29149
Section
AAAI Technical Track on Machine Learning II