BelElect: A New Dataset for Bias Research from a “Dark” Platform

Authors

  • Sviatlana Höhn University of Luxembourg
  • Sjouke Mauw University of Luxembourg
  • Nicholas Asher IRIT

Keywords:

Credibility of online content, Qualitative and quantitative studies of social media

Abstract

New social networks and platforms such as Telegram, Gab and Parler offer a stage for extremist, racist and aggressive content, but also provide a safe space for freedom fighters in authoritarian regimes. Data from such platforms offer excellent opportunities for research on issues such as linguistic bias and toxic language detection. However, only a few, mostly unannotated, English-only corpora from such platforms exist. This article presents a new Telegram corpus in Russian and Belarusian, tailored for research on linguistic bias in political news. In addition, we created a repository to make all currently available corpora from so-called "dark" platforms accessible in one place.

Published

2022-05-31

How to Cite

Höhn, S., Mauw, S., & Asher, N. (2022). BelElect: A New Dataset for Bias Research from a “Dark” Platform. Proceedings of the International AAAI Conference on Web and Social Media, 16(1), 1268-1274. Retrieved from https://ojs.aaai.org/index.php/ICWSM/article/view/19378