Fine-Grained Recognition: Accounting for Subtle Differences between Similar Classes
The main requisite for fine-grained recognition task is to focus on subtle discriminative details that make the subordinate classes different from each other. We note that existing methods implicitly address this requirement and leave it to a data-driven pipeline to figure out what makes a subordinate class different from the others. This results in two major limitations: First, the network focuses on the most obvious distinctions between classes and overlooks more subtle inter-class variations. Second, the chance of misclassifying a given sample in any of the negative classes is considered equal, while in fact, confusions generally occur among only the most similar classes. Here, we propose to explicitly force the network to find the subtle differences among closely related classes. In this pursuit, we introduce two key novelties that can be easily plugged into existing end-to-end deep learning pipelines. On one hand, we introduce “diversification block” which masks the most salient features for an input to force the network to use more subtle cues for its correct classification. Concurrently, we introduce a “gradient-boosting” loss function that focuses only on the confusing classes for each sample and therefore moves swiftly along the direction on the loss surface that seeks to resolve these ambiguities. The synergy between these two blocks helps the network to learn more effective feature representations. Comprehensive experiments are performed on five challenging datasets. Our approach outperforms existing methods using similar experimental setting on all five datasets.