SGoT-R1: Social Graph of Thought Reasoning-Enhanced Multimodal Large Language Model for Harmful Meme Detection

Authors

  • Xiuxian Wang, School of Electrical and Information Engineering, Tianjin University, 300072, China
  • Yuting Su, School of Electrical and Information Engineering, Tianjin University, 300072, China
  • Wenhui Li, School of Electrical and Information Engineering, Tianjin University, 300072, China
  • Xiaowen Wang, School of Electrical and Information Engineering, Tianjin University, 300072, China
  • Zhuojun Li, School of Electrical and Information Engineering, Tianjin University, 300072, China
  • Anan Liu, School of Electrical and Information Engineering, Tianjin University, 300072, China

DOI:

https://doi.org/10.1609/aaai.v40i31.39868

Abstract

Internet memes are widely shared multimodal social content that conveys complex ideas through metaphor. Because they often carry harmful implications, accurate harmful meme detection is an important problem. Reasoning knowledge extracted from large language models has played a crucial role in recent advances in harmful meme detection. However, existing methods analyze a meme from a single viewpoint, ignoring that memes are essentially products of group consensus: their meaning depends on the collision and aggregation of diverse user viewpoints. To address this problem, we propose a Social Graph of Thought Reasoning Enhancement (SGoTRE) framework for harmful meme detection. SGoTRE consists of three key steps. First, through multi-agent simulation, we obtain diverse chains of thought that represent how users from different backgrounds interpret a meme, reproducing the diversity of group cognition. Second, we construct a Social Graph of Thought (SGoT) that integrates the multi-chain reasoning processes and structurally expresses both the consensus and the diversity of viewpoints among users. Finally, we use the SGoT for cognitive distillation, internalizing multi-opinion reasoning logic into a single multimodal large language model, SGoT-R1, to achieve efficient and interpretable harmful meme detection. Experimental results show that SGoT-R1 significantly improves detection performance on mainstream datasets. In particular, on the most challenging FHM dataset, SGoT-R1 achieves an 8.9% improvement over state-of-the-art models.
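
The second step described in the abstract, merging several chains of thought into one graph that exposes consensus and diversity, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how reasoning steps are matched or how the graph is weighted. The function names, the agent names, the exact-text matching of steps, and the `min_share` threshold below are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_thought_graph(chains):
    """Merge per-agent reasoning chains into one directed graph.

    chains: dict mapping a (hypothetical) agent name to an ordered list of
    reasoning-step strings. Steps are matched by normalized text only,
    which is a simplifying assumption of this sketch.
    """
    support = defaultdict(set)  # step -> agents that produced it
    edges = defaultdict(set)    # (step, next_step) -> agents that made the transition
    for agent, steps in chains.items():
        normalized = [" ".join(s.lower().split()) for s in steps]
        for step in normalized:
            support[step].add(agent)
        for src, dst in zip(normalized, normalized[1:]):
            edges[(src, dst)].add(agent)
    return support, edges


def split_consensus(support, n_agents, min_share=0.5):
    """Separate steps shared by more than min_share of agents from the rest."""
    consensus = sorted(s for s, a in support.items() if len(a) / n_agents > min_share)
    divergent = sorted(s for s, a in support.items() if len(a) / n_agents <= min_share)
    return consensus, divergent


# Hypothetical chains of thought from three simulated users.
chains = {
    "agent_a": ["Image shows a crowd", "Caption mocks the group", "Harmful"],
    "agent_b": ["image shows a crowd", "Caption is a pun", "Harmless"],
    "agent_c": ["Image shows a crowd", "Caption mocks the group", "Harmful"],
}

support, edges = build_thought_graph(chains)
consensus, divergent = split_consensus(support, n_agents=len(chains))
print("consensus:", consensus)
print("divergent:", divergent)
```

In this toy example the steps produced by at least two of the three agents land in `consensus`, while the minority reading stays visible in `divergent` instead of being discarded. A real system would presumably match steps semantically instead of by exact text.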

Published

2026-03-14

How to Cite

Wang, X., Su, Y., Li, W., Wang, X., Li, Z., & Liu, A. (2026). SGoT-R1: Social Graph of Thought Reasoning-Enhanced Multimodal Large Language Model for Harmful Meme Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 40(31), 26597–26605. https://doi.org/10.1609/aaai.v40i31.39868

Section

AAAI Technical Track on Machine Learning VIII