Contrastive Instruction Fine-Tuning Large Multimodal Model for Hateful Meme Classification

Authors

  • Ming Shan Hee, Singapore University of Technology and Design
  • Zihan Gao, Beihang University
  • Yinglong Wang, Meituan
  • Xiangxiang Chu, Meituan
  • Roy Ka-Wei Lee, Singapore University of Technology and Design
  • Zengchang Qin, Beihang University

DOI:

https://doi.org/10.1609/icwsm.v19i1.35844

Abstract

Detecting hateful memes requires a model that possesses extensive background knowledge and robust reasoning abilities, especially when the memes contain ambiguous descriptions. Previous research has used large language models (LLMs) and large multimodal models (LMMs) to interpret and categorize these memes. However, distinguishing subtly different hateful and non-hateful memes is still challenging. In recognition of this, our study introduces a unique contrastive instruction fine-tuning approach, InstructMemeCL. This method improves an LMM's ability to discern between memes that have similar visual or textual elements by intensifying its focus on semantic subtleties that separate hateful from non-hateful content. We evaluated our model using AUROC and accuracy metrics on three publicly available hateful meme datasets. The results indicate that our improved LMM more accurately identifies hateful and non-hateful memes, demonstrating superior performance compared to conventional LLMs and LMMs used in similar tasks.
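
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how accuracy and AUROC can be computed for a binary hateful/non-hateful classifier. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label.

    The 0.5 threshold is an assumption for this sketch.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = hateful, 0 = non-hateful.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.4, 0.1]

if __name__ == "__main__":
    print(f"accuracy = {accuracy(labels, scores):.3f}")
    print(f"AUROC    = {auroc(labels, scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUROC depends only on how the scores rank hateful memes relative to non-hateful ones, which is one reason the two are commonly reported together.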

Published

2025-06-07

How to Cite

Hee, M. S., Gao, Z., Wang, Y., Chu, X., Lee, R. K.-W., & Qin, Z. (2025). Contrastive Instruction Fine-Tuning Large Multimodal Model for Hateful Meme Classification. Proceedings of the International AAAI Conference on Web and Social Media, 19(1), 760–773. https://doi.org/10.1609/icwsm.v19i1.35844