Web-Supervised Network with Softly Update-Drop Training for Fine-Grained Visual Classification

Authors

  • Chuanyi Zhang, Nanjing University of Science and Technology
  • Yazhou Yao, Nanjing University of Science and Technology
  • Huafeng Liu, Nanjing University of Science and Technology
  • Guo-Sen Xie, Inception Institute of Artificial Intelligence
  • Xiangbo Shu, Nanjing University of Science and Technology
  • Tianfei Zhou, Inception Institute of Artificial Intelligence
  • Zheng Zhang, Harbin Institute of Technology
  • Fumin Shen, University of Electronic Science and Technology of China
  • Zhenmin Tang, Nanjing University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v34i07.6973

Abstract

Labeling objects at the subordinate level typically requires expert knowledge, which is not always available from a random annotator. Accordingly, learning directly from web images for fine-grained visual classification (FGVC) has attracted broad attention. However, the noise in web images is a major obstacle to training robust deep neural networks. In this paper, we propose a novel approach that removes irrelevant samples from real-world web images during training and uses only the useful images to update the networks. Our network thus alleviates the harmful effects of irrelevant noisy web images and achieves better performance. Extensive experiments on three commonly used fine-grained datasets demonstrate that our approach substantially outperforms state-of-the-art webly supervised methods. The data and source code of this work have been made anonymously available at: https://github.com/z337-408/WSNFGVC.

Published

2020-04-03

How to Cite

Zhang, C., Yao, Y., Liu, H., Xie, G.-S., Shu, X., Zhou, T., Zhang, Z., Shen, F., & Tang, Z. (2020). Web-Supervised Network with Softly Update-Drop Training for Fine-Grained Visual Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 12781-12788. https://doi.org/10.1609/aaai.v34i07.6973

Section

AAAI Technical Track: Vision