PCSR: Pseudo-label Consistency-Guided Sample Refinement for Noisy Correspondence Learning

Authors

  • Zhuoyao Liu Sichuan University
  • Yang Liu Sichuan University
  • Wentao Feng Sichuan University
  • Shudong Huang Sichuan University Engineering Research Center of Machine Learning and Industry Intelligence

DOI:

https://doi.org/10.1609/aaai.v40i28.39578

Abstract

Cross-modal retrieval aims to align different modalities via semantic similarity. However, existing methods often assume that image-text pairs are perfectly aligned, overlooking noisy correspondences in real data. These misaligned pairs misguide similarity learning and degrade retrieval performance. Previous methods often rely on coarse-grained categorizations that simply divide data into clean and noisy samples, overlooking the intrinsic diversity within noisy instances. Moreover, they typically apply uniform training strategies regardless of sample characteristics, resulting in suboptimal sample utilization for model optimization. To address these challenges, we introduce a novel framework, called Pseudo-label Consistency-Guided Sample Refinement (PCSR), which enhances correspondence reliability by explicitly dividing samples based on pseudo-label consistency. Specifically, we first employ a confidence-based estimation to distinguish clean and noisy pairs, then refine the noisy pairs via pseudo-label consistency to uncover structurally distinct subsets. We further propose a Pseudo-label Consistency Score (PCS) to quantify prediction stability, enabling the separation of ambiguous and refinable samples within noisy pairs. Accordingly, we adopt Adaptive Pair Optimization (APO), where ambiguous samples are optimized with robust loss functions and refinable ones are enhanced via text replacement during training. Extensive experiments on CC152K, MS-COCO and Flickr30K validate the effectiveness of our method in improving retrieval robustness under noisy supervision.
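
The two-stage division described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define how confidence or the PCS is computed, so the code below assumes that each pair carries a scalar confidence in [0, 1] and a history of pseudo-labels (for example, the index of the caption predicted for the image at each epoch), and it uses a simple majority-agreement ratio as a stand-in for the PCS. The names `consistency_score` and `partition`, the dictionary keys, and both thresholds are hypothetical.

```python
from collections import Counter


def consistency_score(pseudo_labels):
    """Stand-in for a consistency score (assumption, not the paper's PCS):
    the fraction of recorded pseudo-labels that equal the most frequent one."""
    if not pseudo_labels:
        return 0.0
    _, top_count = Counter(pseudo_labels).most_common(1)[0]
    return top_count / len(pseudo_labels)


def partition(samples, conf_threshold=0.5, pcs_threshold=0.8):
    """Split pairs into clean, ambiguous, and refinable subsets.

    Stage 1: pairs whose confidence reaches conf_threshold are treated as clean.
    Stage 2: the remaining (noisy) pairs are split by consistency score;
             stable predictions -> refinable, unstable -> ambiguous.
    Both thresholds are illustrative values, not taken from the paper.
    """
    clean, ambiguous, refinable = [], [], []
    for sample in samples:
        if sample["confidence"] >= conf_threshold:
            clean.append(sample["id"])
        elif consistency_score(sample["pseudo_labels"]) >= pcs_threshold:
            refinable.append(sample["id"])
        else:
            ambiguous.append(sample["id"])
    return clean, ambiguous, refinable


# Toy data: one confident pair, one noisy pair with stable pseudo-labels,
# and one noisy pair whose pseudo-labels keep changing.
samples = [
    {"id": 0, "confidence": 0.9, "pseudo_labels": [3, 3, 3]},
    {"id": 1, "confidence": 0.2, "pseudo_labels": [7, 7, 7, 7]},
    {"id": 2, "confidence": 0.1, "pseudo_labels": [1, 4, 2, 9]},
]
clean, ambiguous, refinable = partition(samples)
```

Under these assumptions, the refinable subset would be the candidate for text replacement and the ambiguous subset for a robust loss, mirroring the roles the abstract assigns to APO.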

Published

2026-03-14

How to Cite

Liu, Z., Liu, Y., Feng, W., & Huang, S. (2026). PCSR: Pseudo-label Consistency-Guided Sample Refinement for Noisy Correspondence Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(28), 24008–24016. https://doi.org/10.1609/aaai.v40i28.39578

Section

AAAI Technical Track on Machine Learning V