Category-Specific Nuance Exploration Network for Fine-Grained Object Retrieval

Authors

  • Shijie Wang, International School of Information Science and Engineering, Dalian University of Technology, China
  • Zhihui Wang, International School of Information Science and Engineering, Dalian University of Technology, China; Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, China
  • Haojie Li, International School of Information Science and Engineering, Dalian University of Technology, China; Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, China
  • Wanli Ouyang, Sense Time Computer Vision Research Group, The University of Sydney, Australia

DOI:

https://doi.org/10.1609/aaai.v36i3.20152

Keywords:

Computer Vision (CV)

Abstract

Employing additional prior knowledge to model local features as the final fine-grained object representation has become a trend in fine-grained object retrieval (FGOR). A potential limitation of these methods is that, by introducing additional prior knowledge, they focus only on parts common across the dataset (e.g., head, body, or even leg), whereas the retrieval of a fine-grained object may rely on category-specific nuances that contribute to category prediction. To address this limitation, we propose an end-to-end Category-specific Nuance Exploration Network (CNENet) that discovers these category-specific nuances and semantically aligns them, grouped by subcategory, without any additional prior knowledge, so as to directly emphasize the discrepancy among subcategories. Specifically, we design a Nuance Modelling Module that adaptively predicts a group of category-specific response (CARE) maps by implicitly digging into category-specific nuances, specifying their locations and scales. On top of this, two nuance regularizations are proposed: 1) a semantic discrete loss that forces each CARE map to attend to a different spatial region so as to capture diverse nuances; 2) a semantic alignment loss that constructs a consistent semantic correspondence for CARE maps of the same order within the same subcategory by guaranteeing that each instance and its transformed counterpart are spatially aligned. Moreover, we propose a Nuance Expansion Module, which exploits the contextual appearance information of discovered nuances and refines the prediction of the current nuance using its similar neighbors, further improving nuance consistency and completeness. Extensive experiments validate that, under the same settings, our CNENet consistently yields the best performance against most competitive approaches on the CUB Birds, Stanford Cars, and FGVC Aircraft datasets.
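
The two nuance regularizations named in the abstract can be illustrated with a toy sketch. This is not the authors' formulation, which the abstract does not give. It assumes that "attending to different spatial regions" can be approximated by penalizing the pairwise overlap of spatially normalized maps, and that the "transformed counterpart" is a horizontal flip. All function names (`spatial_softmax`, `discrete_loss`, `alignment_loss`) are hypothetical, and maps are plain nested lists rather than network outputs.

```python
from itertools import combinations
from math import exp

Map = list[list[float]]  # one H x W response map (hypothetical stand-in for a CARE map)


def spatial_softmax(m: Map) -> Map:
    """Normalize a response map into a distribution over spatial positions."""
    peak = max(max(row) for row in m)  # subtract the peak for numerical stability
    exps = [[exp(v - peak) for v in row] for row in m]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]


def overlap(a: Map, b: Map) -> float:
    """Inner product of two maps; large when both put mass on the same positions."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def discrete_loss(maps: list[Map]) -> float:
    """Assumed diversity penalty: mean pairwise overlap of normalized maps.

    Requires at least two maps. Lower values mean the maps attend to
    more distinct spatial regions.
    """
    probs = [spatial_softmax(m) for m in maps]
    pairs = list(combinations(probs, 2))
    return sum(overlap(a, b) for a, b in pairs) / len(pairs)


def hflip(m: Map) -> Map:
    """Horizontally flip a map (the assumed image transformation)."""
    return [row[::-1] for row in m]


def alignment_loss(maps: list[Map], maps_of_flipped: list[Map]) -> float:
    """Assumed alignment penalty: mean squared error between the flipped maps
    of an image and the maps predicted for the flipped image, matched by order."""
    total, count = 0.0, 0
    for m, mf in zip(maps, maps_of_flipped):
        for ra, rb in zip(hflip(m), mf):
            for x, y in zip(ra, rb):
                total += (x - y) ** 2
                count += 1
    return total / count


# Two toy 2 x 2 maps: one responds on the left column, one on the right.
left = [[5.0, 0.0], [5.0, 0.0]]
right = [[0.0, 5.0], [0.0, 5.0]]

diverse = discrete_loss([left, right])     # maps cover different regions
redundant = discrete_loss([left, left])    # maps cover the same region
aligned = alignment_loss([left], [right])  # flipping `left` gives `right`
```

In this toy setting, `diverse` is smaller than `redundant`, and `aligned` is zero because the map predicted for the flipped input is exactly the flip of the original map. In a trained network, both terms would be computed on predicted maps and added to the retrieval objective with weights that the abstract does not specify.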

Published

2022-06-28

How to Cite

Wang, S., Wang, Z., Li, H., & Ouyang, W. (2022). Category-Specific Nuance Exploration Network for Fine-Grained Object Retrieval. Proceedings of the AAAI Conference on Artificial Intelligence, 36(3), 2513-2521. https://doi.org/10.1609/aaai.v36i3.20152

Section

AAAI Technical Track on Computer Vision III