The MediaSpin Dataset: Post-Publication News Headline Edits Annotated for Media Bias

Authors

  • Preetika Verma, Carnegie Mellon University; Department of Communications and New Media, National University of Singapore
  • Kokil Jaidka, NUS Centre for Trusted Internet and Community; Department of Communications and New Media, National University of Singapore

DOI:

https://doi.org/10.1609/icwsm.v20i1.42794

Abstract

We present MediaSpin, a large-scale language resource capturing how major news outlets modify headlines after publication, and MediaSpin-in-the-Wild, a complementary dataset linking these revised headlines to their downstream engagement on social media. The increasing editability of online news headlines offers new opportunities to study linguistic framing and bias through the lens of editorial revisions. The dataset contains 78,910 headline pairs annotated for 13 types of media bias, grounded in established media-bias taxonomies, covering both subjective (e.g., sensationalism, spin) and objective (e.g., omission, slant) forms, with annotation conducted through a human-supervised large-language-model pipeline with expert validation and quality control. We describe the annotation schema and demonstrate three downstream applications: (1) cross-national analysis of how country references are added or removed during editing, (2) transformer-based bias classification at both binary and fine-grained levels, and (3) behavioral analysis of biased headlines on X (Twitter) using 180,786 news-related tweets from 819 consenting users. The results reveal regional asymmetries in representational framing, measurable linguistic markers, and consistently higher engagement with biased content. MediaSpin and MediaSpin-in-the-Wild together provide a reproducible benchmark for bias detection and the study of editorial and behavioral dynamics in contemporary media ecosystems.

Published

2026-05-25

How to Cite

Verma, P., & Jaidka, K. (2026). The MediaSpin Dataset: Post-Publication News Headline Edits Annotated for Media Bias. Proceedings of the International AAAI Conference on Web and Social Media, 20(1), 2949–2962. https://doi.org/10.1609/icwsm.v20i1.42794