Dual Attention Networks for Few-Shot Fine-Grained Recognition

Authors

  • Shu-Lin Xu School of Computer Science and Engineering, Nanjing University of Science and Technology State Key Laboratory of Integrated Services Networks, Xidian University
  • Faen Zhang AInnovation Technology Group Co., Ltd
  • Xiu-Shen Wei School of Computer Science and Engineering, Nanjing University of Science and Technology State Key Laboratory of Integrated Services Networks, Xidian University State Key Laboratory for Novel Software Technology, Nanjing University
  • Jianhua Wang AInnovation Technology Group Co., Ltd

DOI:

https://doi.org/10.1609/aaai.v36i3.20196

Keywords:

Computer Vision (CV), Machine Learning (ML)

Abstract

The task of few-shot fine-grained recognition is to classify images belonging to subordinate categories given only a few examples. Owing to this fine-grained nature, it is necessary to capture subtle but discriminative part-level patterns from limited training data, which makes the problem challenging. In this paper, to generate representations tailored to fine-grained few-shot recognition, we propose a Dual Attention Network (Dual Att-Net) consisting of two branches: a hard-attention branch and a soft-attention branch. Specifically, by producing attention guidance from the deep activations of an input image, our hard attention retains a few informative deep descriptors and forms them into a bag for multi-instance learning. Since these deep descriptors can correspond to object parts, modeling them as a multi-instance bag exploits the inherent correlations among these fine-grained parts. In the other branch, a softly attended activation representation is obtained by applying the attention guidance to the original activations, which provides comprehensive attention information as the counterpart of the hard attention. The outputs of both branches are then aggregated into a holistic embedding of the input image. By performing meta-learning, we learn a powerful image embedding in a metric space that generalizes to novel classes. Experiments on three popular fine-grained benchmark datasets show that our Dual Att-Net clearly outperforms existing state-of-the-art methods.
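To make the two-branch idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of how hard and soft attention over deep descriptors could be combined into a single embedding. The guidance signal, the top-k selection, the mean pooling of the multi-instance bag, and the softmax weighting are all illustrative assumptions standing in for the paper's learned components.

```python
import numpy as np

def dual_attention_embedding(activations, k=4):
    """Illustrative sketch of the Dual Att-Net idea (assumed, simplified).

    activations: (C, H, W) deep feature map from a backbone CNN.
    Returns an embedding concatenating hard- and soft-attention branches.
    """
    C, H, W = activations.shape
    descriptors = activations.reshape(C, H * W).T    # (HW, C) deep descriptors
    guidance = activations.mean(axis=0).reshape(-1)  # (HW,) attention guidance

    # Hard-attention branch: keep the top-k descriptors as a multi-instance
    # "bag"; mean pooling stands in here for a learned MIL aggregator.
    topk = np.argsort(guidance)[-k:]
    bag = descriptors[topk]                          # (k, C)
    hard_embed = bag.mean(axis=0)                    # (C,)

    # Soft-attention branch: weight all descriptors by normalized guidance.
    weights = np.exp(guidance - guidance.max())
    weights /= weights.sum()
    soft_embed = (descriptors * weights[:, None]).sum(axis=0)  # (C,)

    # Aggregate both branches into a holistic image embedding.
    return np.concatenate([hard_embed, soft_embed])  # (2C,)
```

In a metric-based meta-learning setup, such embeddings of support and query images would then be compared (e.g., by cosine or Euclidean distance) to classify novel-class queries from a few labeled examples.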

Published

2022-06-28

How to Cite

Xu, S.-L., Zhang, F., Wei, X.-S., & Wang, J. (2022). Dual Attention Networks for Few-Shot Fine-Grained Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 36(3), 2911-2919. https://doi.org/10.1609/aaai.v36i3.20196

Section

AAAI Technical Track on Computer Vision III