SPP-SCL: Semi-Push-Pull Supervised Contrastive Learning for Image-Text Sentiment Analysis and Beyond

Authors

  • Jiesheng Wu, Anhui Normal University
  • Shengrong Li, Nanjing University of Aeronautics and Astronautics

DOI:

https://doi.org/10.1609/aaai.v40i3.37200

Abstract

Existing Image-Text Sentiment Analysis (ITSA) methods may suffer from inconsistent intra-modal and inter-modal sentiment relationships. To address this vision-language imbalance, we propose Semi-Push-Pull Supervised Contrastive Learning (SPP-SCL), a method that balances these relationships before fusing the modalities. The method follows a novel two-step strategy. The first step applies the proposed intra-modal supervised contrastive learning to pull together the relationships within each modality and then evaluates a well-designed conditional execution statement. If the statement evaluates to false, the method performs the second step, inter-modal supervised contrastive learning, which pushes apart the relationships between modalities. The two-step strategy balances the intra-modal and inter-modal relationships so that they become consistent, after which cross-modal feature fusion is performed for sentiment analysis and detection. Experiments on three public image-text sentiment and sarcasm detection datasets show that SPP-SCL outperforms state-of-the-art methods by a large margin and is more discriminative with respect to sentiment.
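
The control flow of the two-step strategy can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the loss formulation or the conditional statement, so the sketch assumes a standard supervised contrastive loss, a hypothetical condition (the intra-modal loss falling below a `threshold`), and a hypothetical inter-modal term computed over the concatenated image and text features. The names `supcon_loss`, `semi_push_pull_loss`, `threshold`, and `temperature` are illustrative only.

```python
import numpy as np


def supcon_loss(features, labels, temperature=0.1):
    """Standard supervised contrastive loss over one batch (assumed form)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # L2-normalise so that dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    not_self = ~np.eye(len(labels), dtype=bool)
    # Subtract the row maximum for numerical stability.
    sim = sim - sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    # Positives are other samples that share the anchor's label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0
    mean_log_prob = (log_prob * positives).sum(axis=1)[valid] / pos_count[valid]
    return float(-mean_log_prob.mean())


def semi_push_pull_loss(img, txt, labels, threshold=1.0, temperature=0.1):
    """Two-step sketch: intra-modal loss first, inter-modal loss only if needed."""
    labels = np.asarray(labels)
    # Step 1: intra-modal supervised contrastive loss for each modality.
    intra = supcon_loss(img, labels, temperature) + supcon_loss(txt, labels, temperature)
    # Hypothetical conditional statement (an assumption, not from the paper).
    if intra < threshold:
        return intra, False
    # Step 2 (condition false): hypothetical inter-modal term over both modalities.
    joint = np.concatenate([np.asarray(img, dtype=float), np.asarray(txt, dtype=float)])
    joint_labels = np.concatenate([labels, labels])
    inter = supcon_loss(joint, joint_labels, temperature)
    return intra + inter, True
```

The function returns the combined loss together with a flag indicating whether the second step ran, which makes the "semi" aspect of the strategy visible: the inter-modal term contributes only when the condition is not met.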

Published

2026-03-14

How to Cite

Wu, J., & Li, S. (2026). SPP-SCL: Semi-Push-Pull Supervised Contrastive Learning for Image-Text Sentiment Analysis and Beyond. Proceedings of the AAAI Conference on Artificial Intelligence, 40(3), 2173-2181. https://doi.org/10.1609/aaai.v40i3.37200

Section

AAAI Technical Track on Cognitive Modeling & Cognitive Systems