Stop-Gradient Softmax Loss for Deep Metric Learning

Authors

  • Lu Yang, Northwestern Polytechnical University; National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology
  • Peng Wang, Northwestern Polytechnical University; National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology
  • Yanning Zhang, Northwestern Polytechnical University; National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology

DOI:

https://doi.org/10.1609/aaai.v37i3.25421

Keywords:

CV: Image and Video Retrieval, CV: Learning & Optimization for CV, CV: Representation Learning for Vision

Abstract

Deep metric learning aims to learn a feature space that models the similarity between images, and feature normalization is a critical step for boosting performance. However, directly optimizing an L2-normalized softmax loss causes the network to fail to converge. Some state-of-the-art approaches therefore append a scale layer after the inner product to relieve the convergence problem, but this introduces a new problem: the best scaling parameters are difficult to learn. In this work, we examine the characteristics of softmax-based approaches and propose a novel learning objective, Stop-Gradient Softmax Loss (SGSL), to solve the convergence problem in softmax-based deep metric learning with L2-normalization. In addition, we identify a useful trick named Remove the last BN-ReLU (RBR), which removes the last BN-ReLU in the backbone to reduce the learning burden of the model. Experimental results on four fine-grained image retrieval benchmarks show that our proposed approach outperforms most existing approaches. Specifically, it achieves 75.9% on CUB-200-2011, 94.7% on CARS196 and 83.1% on SOP, outperforming other approaches by at least 1.7%, 2.9% and 1.7% on Recall@1, respectively.
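
The convergence problem and the scale-layer workaround mentioned in the abstract can be illustrated with a small sketch. This is not the authors' SGSL, whose formulation is not given on this page. It only assumes the common setup in which features and class weights are both L2-normalized, so each logit is a cosine similarity in [-1, 1], optionally multiplied by a scale factor. The function name `loss_lower_bound` and its parameters are hypothetical, chosen for illustration.

```python
import math


def loss_lower_bound(num_classes, scale=1.0):
    """Lower bound on softmax cross-entropy when logits are scale * cosine.

    Assumption (illustrative, not from the paper): features and class
    weights are L2-normalized, so every cosine lies in [-1, 1]. The most
    favorable case for one sample is cosine +1 with the true class and
    -1 with every other class; no logit configuration can do better.
    """
    target_logit = scale * 1.0
    other_logit = scale * -1.0
    # Cross-entropy = log(sum of exp(logits)) - target_logit.
    # Factoring exp(target_logit) out of the sum leaves the stable form
    # log(1 + (C - 1) * exp(other_logit - target_logit)).
    return math.log1p((num_classes - 1) * math.exp(other_logit - target_logit))


if __name__ == "__main__":
    # Without scaling, the loss cannot approach zero for many classes.
    print(loss_lower_bound(num_classes=100, scale=1.0))
    # With a larger scale, the bound becomes negligible.
    print(loss_lower_bound(num_classes=100, scale=16.0))
```

Under these assumptions, the unscaled loss for 100 classes cannot fall below about 2.67, which is one way to see why plain L2-normalized softmax struggles to converge. A scale factor removes the floor, but its value then has to be chosen or learned, which is the difficulty the abstract refers to.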

Published

2023-06-26

How to Cite

Yang, L., Wang, P., & Zhang, Y. (2023). Stop-Gradient Softmax Loss for Deep Metric Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(3), 3164-3172. https://doi.org/10.1609/aaai.v37i3.25421

Section

AAAI Technical Track on Computer Vision III