Weakly Supervised Deep Hyperspherical Quantization for Image Retrieval

Authors

  • Jinpeng Wang (Tsinghua Shenzhen International Graduate School, Tsinghua University; School of Computer Science and Engineering, Sun Yat-sen University)
  • Bin Chen (Tsinghua Shenzhen International Graduate School, Tsinghua University)
  • Qiang Zhang (University College London)
  • Zaiqiao Meng (University of Cambridge)
  • Shangsong Liang (School of Computer Science and Engineering, Sun Yat-sen University)
  • Shutao Xia (Tsinghua Shenzhen International Graduate School, Tsinghua University)

Keywords:

Image and Video Retrieval

Abstract

Deep quantization methods have shown high efficiency on large-scale image retrieval. However, current models rely heavily on ground-truth information, hindering the application of quantization in label-hungry scenarios. A more realistic demand is to learn from inexhaustible uploaded images that are associated with informal tags provided by amateur users. Though such sketchy tags do not obviously reveal the labels, they actually contain useful semantic information for supervising deep quantization. To this end, we propose Weakly-Supervised Deep Hyperspherical Quantization (WSDHQ), which is the first work to learn deep quantization from weakly tagged images. Specifically, 1) we use word embeddings to represent the tags and enhance their semantic information based on a tag correlation graph; 2) to better preserve semantic information in quantization codes and reduce quantization error, we jointly learn semantics-preserving embeddings and a supervised quantizer on a hypersphere by employing a well-designed fusion layer and tailor-made loss functions. Extensive experiments show that WSDHQ can achieve state-of-the-art performance in weakly-supervised compact coding.
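
As a rough illustration of what quantizing on a hypersphere means in general, the sketch below projects an embedding onto the unit sphere and assigns it to the codeword with the highest cosine similarity. This is not the WSDHQ model: the abstract does not specify the fusion layer, the loss functions, or the codebook structure, so the function names (`l2_normalize`, `cosine`, `quantize`) and the toy two-dimensional codebook are assumptions made purely for illustration.

```python
import math


def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity of two unit vectors, which is just their dot product."""
    return sum(a * b for a, b in zip(u, v))


def quantize(embedding, codebook):
    """Return the index of the codeword most similar to the embedding.

    Both the embedding and the codewords are normalized first, so the
    comparison depends only on direction, not on magnitude.
    """
    unit = l2_normalize(embedding)
    sims = [cosine(unit, l2_normalize(c)) for c in codebook]
    return max(range(len(codebook)), key=lambda i: sims[i])


# Hypothetical toy codebook of three 2-D codewords (not from the paper).
codebook = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
code = quantize([0.2, 3.0], codebook)  # index of the nearest codeword
```

Here the embedding points almost straight along the second axis, so it is assigned to the second codeword. A retrieval system built on this idea would store only such indices instead of full embeddings, which is where the compactness of the codes comes from.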

Published

2021-05-18

How to Cite

Wang, J., Chen, B., Zhang, Q., Meng, Z., Liang, S., & Xia, S. (2021). Weakly Supervised Deep Hyperspherical Quantization for Image Retrieval. Proceedings of the AAAI Conference on Artificial Intelligence, 35(4), 2755-2763. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/16380

Section

AAAI Technical Track on Computer Vision III