TY  - JOUR
AU  - Liu, Chuanbin
AU  - Xie, Hongtao
AU  - Zha, Zheng-Jun
AU  - Ma, Lingfeng
AU  - Yu, Lingyun
AU  - Zhang, Yongdong
PY  - 2020/04/03
Y2  - 2024/03/28
TI  - Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization
JF  - Proceedings of the AAAI Conference on Artificial Intelligence
JA  - AAAI
VL  - 34
IS  - 07
SE  - AAAI Technical Track: Vision
DO  - 10.1609/aaai.v34i07.6822
UR  - https://ojs.aaai.org/index.php/AAAI/article/view/6822
SP  - 11555-11562
AB  - <p>Delicate attention to discriminative regions plays a critical role in Fine-Grained Visual Categorization (FGVC). Unfortunately, most existing attention models perform poorly in FGVC because of two pivotal limitations, in <em>discriminative region proposing</em> and in <em>region-based feature learning</em>. 1) The discriminative regions are predominantly located based on the filter responses over the images, which cannot be directly optimized with a performance metric. 2) Existing methods train the region-based feature extractor individually as a one-hot classification task, neglecting the knowledge from the entire object. To address these issues, we propose a novel <em>“Filtration and Distillation Learning” (FDL)</em> model to enhance the region attention of discriminative parts for FGVC. Firstly, a <em>Filtration Learning (FL)</em> method is put forward for proposing discriminative part regions based on the matchability between proposing and predicting. Specifically, we utilize the proposing-predicting matchability as the performance metric of the Region Proposal Network (RPN), thus enabling a direct optimization of the RPN to filtrate the most discriminative regions. In detail, the object-based feature learning and region-based feature learning are formulated as “teacher” and “student”, which can furnish better supervision for region-based feature learning.
Accordingly, our FDL can enhance the region attention effectively, and the overall framework can be trained end-to-end with neither object nor part annotations. Extensive experiments verify that FDL yields state-of-the-art performance on several FGVC tasks when using the same backbone as the most competitive approaches.</p>
ER  -