Contrastive Instruction Fine-Tuning Large Multimodal Model for Hateful Meme Classification

Ming Shan Hee; Zihan Gao; Yinglong Wang; Xiangxiang Chu; Roy Ka-Wei Lee; Zengchang Qin

doi:10.1609/icwsm.v19i1.35844

Authors

Ming Shan Hee, Singapore University of Technology and Design
Zihan Gao, Beihang University
Yinglong Wang, Meituan
Xiangxiang Chu, Meituan
Roy Ka-Wei Lee, Singapore University of Technology and Design
Zengchang Qin, Beihang University

Abstract

Detecting hateful memes requires a model that possesses extensive background knowledge and robust reasoning abilities, especially when the memes contain ambiguous descriptions. Previous research has used large language models (LLMs) and large multimodal models (LMMs) to interpret and categorize these memes. However, distinguishing subtly different hateful and non-hateful memes is still challenging. In recognition of this, our study introduces a unique contrastive instruction fine-tuning approach, InstructMemeCL. This method improves an LMM's ability to discern between memes that have similar visual or textual elements by intensifying its focus on semantic subtleties that separate hateful from non-hateful content. We evaluated our model using AUROC and accuracy metrics on three publicly available hateful meme datasets. The results indicate that our improved LMM more accurately identifies hateful and non-hateful memes, demonstrating superior performance compared to conventional LLMs and LMMs used in similar tasks.
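The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `accuracy` and `auroc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration. Labels use 1 for hateful and 0 for non-hateful, and each score is a model's predicted probability that a meme is hateful.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of memes whose thresholded score matches the gold label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auroc(labels, scores):
    """Probability that a random hateful meme receives a higher score
    than a random non-hateful meme; tied pairs count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = hateful, 0 = non-hateful.
labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.45, 0.6]

print(accuracy(labels, scores))  # 0.5
print(auroc(labels, scores))     # 0.75
```

Accuracy depends on the chosen threshold, whereas AUROC depends only on how the scores rank hateful memes relative to non-hateful ones, which is presumably why the two are reported together.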

Published

2025-06-07

How to Cite

Hee, M. S., Gao, Z., Wang, Y., Chu, X., Lee, R. K.-W., & Qin, Z. (2025). Contrastive Instruction Fine-Tuning Large Multimodal Model for Hateful Meme Classification. Proceedings of the International AAAI Conference on Web and Social Media, 19(1), 760–773. https://doi.org/10.1609/icwsm.v19i1.35844

Issue

Vol. 19 (2025): Proceedings of the Nineteenth International AAAI Conference on Web and Social Media

Section

Full Papers
