A Telegram Dataset of Propaganda and its Moderation

Klim Kireev; Yevhen Mykhno; Carmela Troncoso; Rebekah Overdorf

doi:10.1609/icwsm.v19i1.35952

Authors

Klim Kireev EPFL, Max Planck Institute for Security and Privacy (MPI-SP)
Yevhen Mykhno Independent researcher
Carmela Troncoso EPFL, Max Planck Institute for Security and Privacy (MPI-SP)
Rebekah Overdorf Ruhr University Bochum, Research Center Trustworthy Data Science and Security, University Alliance Ruhr, University of Lausanne

DOI:

https://doi.org/10.1609/icwsm.v19i1.35952

Abstract

Messaging applications like Telegram have evolved into de facto social networking platforms as they add features like broadcast channels and large groups. Yet, research on these aspects of Telegram is sparse compared to more traditional social media platforms. In this paper, we present a dataset of Telegram messages collected using the export API that returns channel histories, complemented by messages collected in real time. This dual collection methodology allows us to label deleted messages, i.e., messages that are present in the real-time dataset but not in the historical dataset. Additionally, we provide labels indicating whether messages have been sent by accounts belonging to one of two distinct propaganda networks. We provide experiments that show how this rich dataset of Telegram messages can be used to study moderation on Telegram, to track stances and trends on different topics, and to shed light on malicious behaviours present on Telegram. Finally, we outline other use cases where our dataset could help the research community better understand Telegram as a social network.
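The deletion labelling described in the abstract amounts to a set difference between the two collections. The following is a minimal sketch of that idea, not the authors' pipeline: the `Message` type, its field names, and the `label_deleted` function are hypothetical, and it assumes that a message is uniquely identified by its channel ID and message ID and that the historical export was taken after every real-time message was observed.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Message(NamedTuple):
    # Hypothetical record layout; the dataset's actual schema may differ.
    channel_id: int
    message_id: int
    text: str


Key = Tuple[int, int]


def label_deleted(
    realtime: Iterable[Message], historical: Iterable[Message]
) -> Dict[Key, bool]:
    """Map each real-time message key to True if it is absent from the history."""
    # Keys of messages that still exist in the exported channel history.
    surviving = {(m.channel_id, m.message_id) for m in historical}
    labels: Dict[Key, bool] = {}
    for m in realtime:
        key = (m.channel_id, m.message_id)
        # Seen in real time but missing from the later export: label as deleted.
        labels[key] = key not in surviving
    return labels


realtime = [
    Message(1, 10, "first"),
    Message(1, 11, "second"),
    Message(2, 10, "third"),
]
historical = [
    Message(1, 10, "first"),
    Message(2, 10, "third"),
]
labels = label_deleted(realtime, historical)
```

In this toy input, only the message with key `(1, 11)` is missing from the historical export, so it is the only one labelled as deleted.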
