FactDrill: A Data Repository of Fact-Checked Social Media Content to Study Fake News Incidents in India

Authors

  • Shivangi Singhal, Indraprastha Institute of Information Technology, Delhi, India
  • Rajiv Ratn Shah, Indraprastha Institute of Information Technology, Delhi, India
  • Ponnurangam Kumaraguru, International Institute of Information Technology, Hyderabad, India

Keywords:

Credibility of online content, Analysis of the relationship between social media and mainstream media, Measuring predictability of real world phenomena based on social media, e.g., spanning politics, finance, and health, Subjectivity in textual data; sentiment analysis; polarity/opinion identification and extraction, linguistic analyses of social media behavior

Abstract

The production and circulation of fake content in India is a rising problem, and there is a pressing need to investigate false claims made in public. This paper presents a dataset of 22,435 fact-checked social media items for studying fake news incidents in India. The dataset comprises news stories from 2013 to 2020, covering 13 different languages spoken in the country. We describe in detail the 14 attributes present in the dataset and characterise its three M’s (multi-lingual, multi-media, multi-domain). Lastly, we present some potential use cases of the dataset. We expect the dataset to be a valuable resource for understanding the dynamics of fake content in a multi-lingual setting in India.

Published

2022-05-31

How to Cite

Singhal, S., Shah, R. R., & Kumaraguru, P. (2022). FactDrill: A Data Repository of Fact-Checked Social Media Content to Study Fake News Incidents in India. Proceedings of the International AAAI Conference on Web and Social Media, 16(1), 1322-1331. Retrieved from https://ojs.aaai.org/index.php/ICWSM/article/view/19384