COSMOS: Catching Out-of-Context Image Misuse Using Self-Supervised Learning

Authors

  • Shivangi Aneja Technical University of Munich
  • Chris Bregler Google Research
  • Matthias Niessner Technical University of Munich

DOI:

https://doi.org/10.1609/aaai.v37i12.26648

Keywords:

General

Abstract

Despite the recent attention to DeepFakes, one of the most prevalent ways to mislead audiences on social media is the use of unaltered images in a new but false context. We propose a new method that automatically highlights out-of-context image and text pairs to assist fact-checkers. Our key insight is to leverage the grounding of images with text to distinguish out-of-context scenarios that cannot be disambiguated with language alone. We propose a self-supervised training strategy where we only need a set of captioned images. At train time, our method learns to selectively align individual objects in an image with textual claims, without explicit supervision. At test time, we check whether both captions correspond to the same object(s) in the image but are semantically different, which allows us to make fairly accurate out-of-context predictions. Our method achieves 85% out-of-context detection accuracy. To facilitate benchmarking of this task, we create a large-scale dataset of 200K images with 450K textual captions from a variety of news websites, blogs, and social media posts.
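The test-time check described in the abstract can be sketched as a simple decision rule over embeddings: two captions are flagged as out-of-context when both ground to the same image object yet differ semantically from each other. The function names, toy vectors, and thresholds below are illustrative assumptions, not the paper's released implementation, which uses learned object and text encoders.

```python
# Hedged sketch of the out-of-context decision rule from COSMOS's
# test-time procedure. Embeddings here are toy vectors; in the paper
# they come from learned image-object and caption encoders.
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return num / den if den else 0.0


def is_out_of_context(obj_emb, cap1_emb, cap2_emb,
                      ground_thresh=0.5, sim_thresh=0.5):
    """Flag a (caption1, caption2) pair as out-of-context for an image:
    both captions must align with the same object region, while being
    semantically different from each other. Thresholds are hypothetical."""
    grounded1 = cosine(obj_emb, cap1_emb) > ground_thresh
    grounded2 = cosine(obj_emb, cap2_emb) > ground_thresh
    captions_differ = cosine(cap1_emb, cap2_emb) < sim_thresh
    return grounded1 and grounded2 and captions_differ


# Toy example: both captions point at the same object, but disagree
# with each other, so the pair is flagged as out-of-context.
obj = [1.0, 0.0, 0.0]
cap_a = [0.6, 0.8, 0.0]   # grounded to obj, claim A
cap_b = [0.6, -0.8, 0.0]  # grounded to obj, contradictory claim B
cap_c = [0.6, 0.79, 0.1]  # grounded to obj, paraphrase of claim A
```

A paraphrase pair such as `(cap_a, cap_c)` is not flagged, because the captions are semantically close even though both ground to the object.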

Published

2023-06-26

How to Cite

Aneja, S., Bregler, C., & Niessner, M. (2023). COSMOS: Catching Out-of-Context Image Misuse Using Self-Supervised Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(12), 14084-14092. https://doi.org/10.1609/aaai.v37i12.26648

Section

AAAI Special Track on AI for Social Impact