Contrastive Quantization with Code Memory for Unsupervised Image Retrieval

Authors

  • Jinpeng Wang (Tsinghua University; Harbin Institute of Technology, Shenzhen; Research Center of Artificial Intelligence, Peng Cheng Laboratory)
  • Ziyun Zeng (Tsinghua University; Research Center of Artificial Intelligence, Peng Cheng Laboratory)
  • Bin Chen (Harbin Institute of Technology, Shenzhen)
  • Tao Dai (Shenzhen University)
  • Shu-Tao Xia (Tsinghua University; Research Center of Artificial Intelligence, Peng Cheng Laboratory)

DOI:

https://doi.org/10.1609/aaai.v36i3.20147

Keywords:

Computer Vision (CV), Data Mining & Knowledge Management (DMKM), Machine Learning (ML)

Abstract

Its high efficiency in computation and storage makes hashing (including binary hashing and quantization) a common strategy in large-scale retrieval systems. To reduce the reliance on expensive annotations, unsupervised deep hashing has become an important research problem. This paper provides a novel solution to unsupervised deep quantization, namely Contrastive Quantization with Code Memory (MeCoQ). Unlike existing reconstruction-based strategies, we learn unsupervised binary descriptors by contrastive learning, which can better capture discriminative visual semantics. In addition, we find that codeword diversity regularization is critical to prevent contrastive learning-based quantization from degenerating. Moreover, we introduce a novel quantization code memory module that boosts contrastive learning with lower feature drift than conventional feature memories. Extensive experiments on benchmark datasets show that MeCoQ outperforms state-of-the-art methods. Code and configurations are publicly released.
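
The three ingredients named in the abstract (quantization against a codebook, a contrastive objective, and a codeword diversity regularizer) can be illustrated with a minimal pure-Python sketch. This is not the released MeCoQ code: the function names, the single-codebook setup, the temperatures, and the choice of mean pairwise cosine similarity as the diversity penalty are all assumptions made for illustration, and the code memory module is left out.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def soft_quantize(x, codebook, temperature=0.1):
    """Softmax-weighted mix of codewords: a differentiable stand-in
    for hard codeword assignment (assumed form, for illustration)."""
    sims = [cosine(x, c) / temperature for c in codebook]
    m = max(sims)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in sims]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * c[d] for w, c in zip(weights, codebook))
            for d in range(len(x))]


def hard_code(x, codebook):
    # Index of the most similar codeword: the code stored at retrieval time.
    return max(range(len(codebook)), key=lambda k: cosine(x, codebook[k]))


def info_nce(anchor, positive, negatives, temperature=0.2):
    """InfoNCE-style contrastive loss for one anchor:
    -log softmax of the positive's similarity among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


def codeword_diversity_penalty(codebook):
    """Mean pairwise cosine similarity between codewords (hypothetical
    regularizer): high when codewords collapse onto each other."""
    k = len(codebook)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(cosine(codebook[i], codebook[j]) for i, j in pairs) / len(pairs)
```

In such a setup, the contrastive loss would be computed on the soft-quantized representations of two augmented views of an image, and the diversity penalty would be added with a small weight so that the codewords stay spread out rather than collapsing to a few directions.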

Published

2022-06-28

How to Cite

Wang, J., Zeng, Z., Chen, B., Dai, T., & Xia, S.-T. (2022). Contrastive Quantization with Code Memory for Unsupervised Image Retrieval. Proceedings of the AAAI Conference on Artificial Intelligence, 36(3), 2468-2476. https://doi.org/10.1609/aaai.v36i3.20147

Section

AAAI Technical Track on Computer Vision III