Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media

Authors

  • Liam Hebert University of Waterloo
  • Gaurav Sahu University of Waterloo
  • Yuxuan Guo University of Waterloo
  • Nanda Kishore Sreenivas University of Waterloo
  • Lukasz Golab University of Waterloo
  • Robin Cohen University of Waterloo

DOI:

https://doi.org/10.1609/aaai.v38i20.30213

Keywords:

General

Abstract

We present the Multi-Modal Discussion Transformer (mDT), a novel method for detecting hate speech in online social network discussions, such as those on Reddit. In contrast to traditional comment-only methods, our approach labels a comment as hate speech through a holistic analysis of text and images grounded in the discussion context. We do this by using graph transformers to capture the contextual relationships in the discussion surrounding a comment, and by grounding in that context the interwoven fusion layers that combine text and image embeddings, rather than processing each modality separately. To evaluate our work, we present a new dataset, HatefulDiscussions, comprising complete multi-modal discussions from multiple online communities on Reddit. We compare the performance of our model to baselines that only process individual comments and conduct extensive ablation studies.
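
The abstract does not describe how the discussion graph is built, so the following is only an illustrative sketch of what "discussion context" could mean structurally, not the authors' implementation. It assumes a discussion can be stored as a map from each comment to its parent, and it uses reply-hop distance as a stand-in for the contextual relationships a graph transformer might attend over. The toy data and the names `parents`, `build_adjacency`, `hop_distances`, and `context_window` are all hypothetical.

```python
from collections import deque

# Hypothetical toy discussion: comment id -> parent id (None marks the root post).
parents = {"post": None, "c1": "post", "c2": "post", "c3": "c1", "c4": "c3"}


def build_adjacency(parents):
    """Turn parent links into an undirected adjacency map over the reply tree."""
    adjacency = {node: set() for node in parents}
    for child, parent in parents.items():
        if parent is not None:
            adjacency[child].add(parent)
            adjacency[parent].add(child)
    return adjacency


def hop_distances(adjacency, source):
    """Breadth-first search: number of reply hops from source to every node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def context_window(parents, target, max_hops):
    """Comments within max_hops of target (assumed notion of 'context'), sorted."""
    distances = hop_distances(build_adjacency(parents), target)
    return sorted(node for node, d in distances.items() if d <= max_hops)


if __name__ == "__main__":
    # Comments within one reply hop of "c3": its parent, itself, and its reply.
    print(context_window(parents, "c3", 1))
```

A comment-only baseline would see just the target comment, whereas a context-aware model in this sketch would also receive the neighbouring comments returned by `context_window`. How mDT actually selects and weights context is specified in the paper, not here.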

Published

2024-03-24

How to Cite

Hebert, L., Sahu, G., Guo, Y., Sreenivas, N. K., Golab, L., & Cohen, R. (2024). Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media. Proceedings of the AAAI Conference on Artificial Intelligence, 38(20), 22096-22104. https://doi.org/10.1609/aaai.v38i20.30213