Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization

Authors

  • Chuanbin Liu, University of Science and Technology of China
  • Hongtao Xie, University of Science and Technology of China
  • Zheng-Jun Zha, University of Science and Technology of China
  • Lingfeng Ma, University of Science and Technology of China
  • Lingyun Yu, University of Science and Technology of China
  • Yongdong Zhang, University of Science and Technology of China

DOI

https://doi.org/10.1609/aaai.v34i07.6822

Abstract

Delicate attention to the discriminative regions plays a critical role in Fine-Grained Visual Categorization (FGVC). Unfortunately, most existing attention models perform poorly in FGVC because of two pivotal limitations in discriminative region proposal and region-based feature learning. 1) The discriminative regions are predominantly located based on the filter responses over the images, which cannot be directly optimized with a performance metric. 2) Existing methods train the region-based feature extractor individually as a one-hot classification task, neglecting the knowledge from the entire object. To address these issues, we propose a novel “Filtration and Distillation Learning” (FDL) model to enhance the region attention of discriminative parts for FGVC. Firstly, a Filtration Learning (FL) method is put forward for proposing discriminative part regions based on the matchability between proposing and predicting. Specifically, we utilize the proposing-predicting matchability as the performance metric of the Region Proposal Network (RPN), thus enabling direct optimization of the RPN to filtrate the most discriminative regions. Secondly, in the distillation step, object-based feature learning and region-based feature learning are formulated as “teacher” and “student”, respectively, which can furnish better supervision for region-based feature learning. Accordingly, our FDL can enhance the region attention effectively, and the overall framework can be trained end-to-end without either object or part annotations. Extensive experiments verify that FDL yields state-of-the-art performance under the same backbone as the most competitive approaches on several FGVC tasks.
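
The two ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss functions, so everything below is an assumption chosen for illustration and is not the authors' formulation: the function names, the temperature, the margin, the temperature-softened cross-entropy used for the teacher-student term, and the pairwise hinge used for proposing-predicting matchability are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's softened distribution against the teacher's.

    Assumption: the object-based branch plays the teacher and the
    region-based branch plays the student; the exact loss is hypothetical.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher, student))


def matchability_loss(proposal_scores, region_confidences, margin=1.0):
    """Pairwise hinge loss between proposing and predicting (hypothetical form).

    If region i is classified more confidently than region j, the proposer
    is pushed to score region i at least `margin` higher than region j.
    """
    loss = 0.0
    n = len(proposal_scores)
    for i in range(n):
        for j in range(n):
            if region_confidences[i] > region_confidences[j]:
                gap = proposal_scores[i] - proposal_scores[j]
                loss += max(0.0, margin - gap)
    return loss


if __name__ == "__main__":
    # The student that agrees with the teacher incurs the smaller loss.
    print(distillation_loss([2.0, 0.0], [2.0, 0.0]))
    print(distillation_loss([2.0, 0.0], [0.0, 2.0]))
    # Proposal scores ordered like the confidences incur no ranking penalty.
    print(matchability_loss([2.0, 0.0], [0.9, 0.1]))
    print(matchability_loss([0.0, 2.0], [0.9, 0.1]))
```

In this sketch the ranking term is what lets a classification signal reach the region proposer directly, and the soft-target term is what gives the region branch richer supervision than a one-hot label; the paper itself should be consulted for the actual objectives.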

Published

2020-04-03

How to Cite

Liu, C., Xie, H., Zha, Z.-J., Ma, L., Yu, L., & Zhang, Y. (2020). Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 11555-11562. https://doi.org/10.1609/aaai.v34i07.6822

Section

AAAI Technical Track: Vision