G^2SAM: Graph-Based Global Semantic Awareness Method for Multimodal Sarcasm Detection

Authors

  • Yiwei Wei: Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University; China University of Petroleum (Beijing) at Karamay
  • Shaozu Yuan: JD AI Research
  • Hengyang Zhou: China University of Petroleum (Beijing) at Karamay
  • Longbiao Wang: Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University; Huiyan Technology (Tianjin) Co., Ltd
  • Zhiling Yan: JD AI Research
  • Ruosong Yang: JD AI Research
  • Meng Chen: Yep AI

DOI:

https://doi.org/10.1609/aaai.v38i8.28766

Keywords:

DMKM: Mining of Visual, Multimedia & Multimodal Data, CV: Language and Vision, CV: Multi-modal Vision, KRR: Applications

Abstract

Multimodal sarcasm detection, which aims to detect ironic sentiment in multimodal social data, has gained substantial popularity in both the natural language processing and computer vision communities. Recently, graph-based studies that draw sentimental relations to detect multimodal sarcasm have made notable advances. However, they have neglected to exploit graph-based global semantic congruity from existing instances to facilitate prediction, which ultimately hinders model performance. In this paper, we introduce a new inference paradigm that leverages global graph-based semantic awareness to handle this task. First, we construct a fine-grained multimodal graph for each instance and integrate these graphs into a semantic space to draw graph-based relations. During inference, we leverage global semantic congruity to retrieve the k-nearest neighbor instances in the semantic space as references that vote on the final prediction. To enhance the semantic correlation of representations in the semantic space, we also introduce label-aware graph contrastive learning, which further improves performance. Experimental results demonstrate that our model achieves state-of-the-art (SOTA) performance in multimodal sarcasm detection. The code will be available at https://github.com/upccpu/G2SAM.
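
The retrieval-and-voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the voting rule, or how instance embeddings are produced, so cosine similarity, simple majority voting, and the names `cosine_similarity`, `knn_vote`, `bank_embeddings`, and `bank_labels` are all assumptions made for illustration. The toy 2-D vectors stand in for graph-level representations of existing labeled instances.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_vote(query, bank_embeddings, bank_labels, k=3):
    """Predict a label by majority vote over the k most similar stored instances.

    `bank_embeddings` and `bank_labels` are hypothetical names for the
    representations and labels of existing instances in the semantic space.
    """
    scored = [
        (cosine_similarity(query, emb), label)
        for emb, label in zip(bank_embeddings, bank_labels)
    ]
    # Highest similarity first; keep only the k nearest neighbors.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top_labels = [label for _, label in scored[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Toy data: label 1 = sarcastic, label 0 = non-sarcastic (illustrative only).
bank_embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [0.8, 0.3]]
bank_labels = [1, 1, 0, 0, 1]

prediction = knn_vote([0.95, 0.15], bank_embeddings, bank_labels, k=3)
print(prediction)
```

With an odd `k` and two classes, the vote cannot tie. A weighted vote (for example, weighting each neighbor by its similarity) would be an equally plausible reading of the abstract.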

Published

2024-03-24

How to Cite

Wei, Y., Yuan, S., Zhou, H., Wang, L., Yan, Z., Yang, R., & Chen, M. (2024). G^2SAM: Graph-Based Global Semantic Awareness Method for Multimodal Sarcasm Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 38(8), 9151-9159. https://doi.org/10.1609/aaai.v38i8.28766

Issue

Section

AAAI Technical Track on Data Mining & Knowledge Management