Mutual-Enhanced Incongruity Learning Network for Multi-Modal Sarcasm Detection

Yang Qiao; Liqiang Jing; Xuemeng Song; Xiaolin Chen; Lei Zhu; Liqiang Nie

doi:10.1609/aaai.v37i8.26138

Authors

Yang Qiao Shandong University
Liqiang Jing Shandong University
Xuemeng Song Shandong University
Xiaolin Chen Shandong University
Lei Zhu Shandong Normal University
Liqiang Nie Harbin Institute of Technology (Shenzhen)

DOI:

https://doi.org/10.1609/aaai.v37i8.26138

Keywords:

ML: Multimodal Learning, SNLP: Sentiment Analysis and Stylistic Analysis

Abstract

Sarcasm is a sophisticated linguistic phenomenon that is prevalent on today's social media platforms. Multi-modal sarcasm detection aims to identify whether a given sample with multi-modal information (i.e., text and image) is sarcastic. The key to this task lies in capturing both inter- and intra-modal incongruities within the same context. Although existing methods have achieved compelling success, they are either disturbed by irrelevant information extracted from the whole image and text, or overlook important information due to incomplete input. To address these limitations, we propose a Mutual-enhanced Incongruity Learning Network for multi-modal sarcasm detection, named MILNet. In particular, we design a local semantic-guided incongruity learning module and a global incongruity learning module. Moreover, we introduce a mutual enhancement module that exploits the underlying consistency between the two modules to boost performance. Extensive experiments on a widely used dataset demonstrate the superiority of our model over cutting-edge methods.
