Deep Metric Learning with Self-Supervised Ranking

Authors

  • Zheren Fu, University of Science and Technology of China
  • Yan Li, Kuaishou Technology
  • Zhendong Mao, University of Science and Technology of China
  • Quan Wang, University of Science and Technology of China
  • Yongdong Zhang, University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v35i2.16226

Keywords:

Image and Video Retrieval, Unsupervised & Self-Supervised Learning

Abstract

Deep metric learning aims to learn a deep embedding space in which similar objects are pulled together and dissimilar objects are pushed apart. Existing approaches typically use inter-class characteristics, e.g. class-level information or instance-level similarity, to obtain the semantic relevance of data points and to enforce a large margin between different classes in the embedding space. However, intra-class characteristics, e.g. the local manifold structure or relative relationships within the same class, are usually overlooked in the learning process. Hence the data structure cannot be fully exploited and the output embeddings are limited in retrieval. More importantly, the retrieval results lack a good ranking. This paper presents a novel self-supervised ranking auxiliary framework that captures intra-class as well as inter-class characteristics for better metric learning. Our method defines specific transform functions to simulate local intra-class structure changes in the initial image domain, and formulates a self-supervised learning procedure to fully exploit this property and preserve it in the embedding space. Extensive experiments on three standard benchmarks show that our method significantly outperforms state-of-the-art methods in both retrieval and ranking performance, by 2%-4%.
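
The abstract does not specify the transform functions or the loss, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that one image is transformed at several strengths, ordered from weakest to strongest, and that the embedding of a more strongly transformed copy should lie farther from the embedding of the original. The function name `ranking_hinge_loss`, the `margin` value, and the use of Euclidean distance are all illustrative assumptions.

```python
import numpy as np


def ranking_hinge_loss(anchor, variants, margin=0.1):
    """Hypothetical ranking loss over transformed copies of one image.

    anchor:   (d,) embedding of the untransformed image.
    variants: (k, d) embeddings of the same image under k transforms,
              assumed to be ordered from weakest to strongest.
    """
    # Euclidean distance from the anchor to each transformed variant.
    dists = np.linalg.norm(variants - anchor, axis=1)
    # For each adjacent pair, the weaker transform should be closer
    # to the anchor than the stronger one by at least `margin`.
    violations = dists[:-1] - dists[1:] + margin
    # Sum the positive parts; the loss is zero when the ranking holds.
    return float(np.maximum(violations, 0.0).sum())


# Toy 2-D embeddings (made up for illustration only).
anchor = np.zeros(2)
well_ranked = np.array([[0.1, 0.0], [0.5, 0.0], [1.0, 0.0]])
badly_ranked = well_ranked[::-1]

print(ranking_hinge_loss(anchor, well_ranked))
print(ranking_hinge_loss(anchor, badly_ranked))
```

In this sketch the loss vanishes when distances already increase with transform strength and grows when the order is violated. In practice such a term would presumably be added as an auxiliary objective alongside a standard inter-class metric loss.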

Published

2021-05-18

How to Cite

Fu, Z., Li, Y., Mao, Z., Wang, Q., & Zhang, Y. (2021). Deep Metric Learning with Self-Supervised Ranking. Proceedings of the AAAI Conference on Artificial Intelligence, 35(2), 1370-1378. https://doi.org/10.1609/aaai.v35i2.16226

Section

AAAI Technical Track on Computer Vision I