Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation

Authors

  • Zhuohang Dang School of Computer Science and Technology, MOEKLINNS Laboratory, Xi'an Jiaotong University
  • Minnan Luo School of Computer Science and Technology, MOEKLINNS Laboratory, Xi'an Jiaotong University
  • Chengyou Jia School of Computer Science and Technology, MOEKLINNS Laboratory, Xi'an Jiaotong University
  • Guang Dai SGIT AI Lab, State Grid Corporation of China
  • Xiaojun Chang University of Technology Sydney; Mohamed bin Zayed University of Artificial Intelligence
  • Jingdong Wang Baidu

DOI:

https://doi.org/10.1609/aaai.v38i2.27911

Keywords:

CV: Language and Vision, CV: Applications, CV: Image and Video Retrieval

Abstract

Cross-modal retrieval relies on well-matched, large-scale datasets that are laborious to collect in practice. Recently, to alleviate expensive data collection, co-occurring pairs have been automatically harvested from the Internet for training. However, such harvesting inevitably introduces mismatched pairs, i.e., noisy correspondences, which undermine supervision reliability and degrade performance. Current methods leverage the memorization effect of deep neural networks to address noisy correspondences, but they overconfidently focus on similarity-guided training with hard negatives and thus suffer from self-reinforcing errors. In light of the above, we introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM). Specifically, by viewing sample matching as a classification task within the batch, we generate classification logits for each sample. Instead of relying on a single similarity score, we refine sample filtration through energy uncertainty and estimate the model's sensitivity to selected clean samples using swapped classification entropy, in view of the overall prediction distribution. Additionally, we propose cross-modal biased complementary learning to leverage the negative matches overlooked in hard-negative training, further stabilizing model optimization and curbing self-reinforcing errors. Extensive experiments on challenging benchmarks affirm the efficacy and efficiency of SREM.
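To make the abstract's pipeline concrete, below is a minimal PyTorch-style sketch of the in-batch matching-as-classification view, an energy-based uncertainty score, and a swapped classification entropy. This is an illustrative assumption rather than the authors' released implementation: the function names, the temperature `tau`, the 75% filtering threshold, and the exact form of each score are our own hypothetical choices.

```python
import torch
import torch.nn.functional as F

def in_batch_logits(img_emb, txt_emb, tau=0.05):
    """Treat matching as in-batch classification: row i holds the logits of
    image i over every text in the batch (the diagonal is the putative match)."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    return img_emb @ txt_emb.t() / tau

def energy_uncertainty(logits):
    """Free-energy score E(x) = -logsumexp(logits): lower energy means the whole
    logit row is more confident, one plausible filter beyond a single similarity."""
    return -torch.logsumexp(logits, dim=-1)

def swapped_entropy(img_emb, txt_emb, tau=0.05):
    """Average entropy of the image-to-text and text-to-image matching
    distributions; high entropy flags pairs the model is sensitive to."""
    logits = in_batch_logits(img_emb, txt_emb, tau)
    p_i2t = logits.softmax(dim=-1)
    p_t2i = logits.t().softmax(dim=-1)
    ent = lambda p: -(p * p.clamp_min(1e-12).log()).sum(dim=-1)
    return 0.5 * (ent(p_i2t) + ent(p_t2i))

if __name__ == "__main__":
    img = torch.randn(128, 256)  # toy image embeddings
    txt = torch.randn(128, 256)  # toy text embeddings
    logits = in_batch_logits(img, txt)
    # e.g. keep the 75% of pairs with the lowest (most confident) energy
    keep = energy_uncertainty(logits).argsort()[: int(0.75 * 128)]
    print(keep.shape, swapped_entropy(img, txt).shape)
```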

Published

2024-03-24

How to Cite

Dang, Z., Luo, M., Jia, C., Dai, G., Chang, X., & Wang, J. (2024). Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 1463-1471. https://doi.org/10.1609/aaai.v38i2.27911

Issue

Vol. 38 No. 2 (2024)

Section

AAAI Technical Track on Computer Vision I