Adversarial Zero-shot Learning With Semantic Augmentation


  • Bin Tong, R&D Group, Hitachi
  • Martin Klinkigt, R&D Group, Hitachi
  • Junwen Chen, R&D Group, Hitachi
  • Xiankun Cui, R&D Group, Hitachi
  • Quan Kong, R&D Group, Hitachi
  • Tomokazu Murakami, R&D Group, Hitachi
  • Yoshiyuki Kobayashi, R&D Group, Hitachi


Keywords: zero-shot learning, generative adversarial network, image classification, image retrieval


When labels are expensive or difficult to obtain, deep neural networks for object recognition often struggle to achieve adequate performance. Zero-shot learning addresses this problem: it aims to recognize objects of unseen classes by transferring knowledge from seen classes via a shared intermediate representation. Exploiting the manifold structure of the seen training samples is widely regarded as important for learning a robust mapping between samples and the intermediate representation, which is crucial for transferring knowledge. However, irregularities in this structure, such as a lack of variation among the samples of certain classes and highly overlapping clusters of different classes, may result in an inappropriate mapping. Additionally, in a high-dimensional mapping space, the hubness problem may arise, in which a single unseen class is likely to be assigned to samples of many different classes. To mitigate these problems, we use a generative adversarial network to synthesize samples with specified semantics, covering both a higher diversity within the given classes and the interpolated semantics of pairs of classes. We propose a simple yet effective method for applying the augmented semantics to hinge loss functions to learn a robust mapping. The proposed method was extensively evaluated on small- and large-scale datasets and showed a significant improvement over state-of-the-art methods.
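
The abstract does not spell out the loss, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that class semantics are plain attribute vectors, that interpolated semantics are a linear mix of two class vectors, and that compatibility is a dot product inside a standard hinge ranking loss. The GAN that synthesizes samples is omitted, and all names and values (`interpolate`, `hinge_ranking_loss`, the toy vectors) are hypothetical.

```python
def interpolate(a, b, alpha):
    """Linearly mix two class semantic vectors (assumed interpolation scheme)."""
    return [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]


def dot(u, v):
    """Dot product, used here as an assumed compatibility score."""
    return sum(x * y for x, y in zip(u, v))


def hinge_ranking_loss(mapped, true_sem, other_sems, margin=1.0):
    """Sum of max(0, margin - score(true) + score(other)) over competing semantics."""
    pos = dot(mapped, true_sem)
    return sum(max(0.0, margin - pos + dot(mapped, s)) for s in other_sems)


# Hypothetical 3-dimensional attribute vectors for two seen classes.
zebra = [1.0, 0.0, 1.0]
horse = [0.0, 1.0, 1.0]

# Augmented semantics: a point halfway between the two classes.
mix = interpolate(zebra, horse, 0.5)

# Pretend a zebra sample has already been mapped into the semantic space.
mapped_sample = [0.9, 0.1, 0.8]

# The interpolated vector acts as an extra competitor in the hinge loss.
loss = hinge_ranking_loss(mapped_sample, zebra, [horse, mix], margin=1.0)
print(loss)
```

In this toy setup the interpolated vector sits closer to the true class than the other class does, so it contributes a larger hinge penalty; this is one plausible way that augmented semantics could tighten a learned mapping.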




How to Cite

Tong, B., Klinkigt, M., Chen, J., Cui, X., Kong, Q., Murakami, T., & Kobayashi, Y. (2018). Adversarial Zero-shot Learning With Semantic Augmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). Retrieved from



Main Track: Machine Learning Applications