BelElect: A New Dataset for Bias Research from a “Dark” Platform

Sviatlana Höhn; Sjouke Mauw; Nicholas Asher

doi:10.1609/icwsm.v16i1.19378

Authors

Sviatlana Höhn, University of Luxembourg
Sjouke Mauw, University of Luxembourg
Nicholas Asher, IRIT

DOI:

https://doi.org/10.1609/icwsm.v16i1.19378

Keywords:

Credibility of online content, Qualitative and quantitative studies of social media

Abstract

New social networks and platforms such as Telegram, Gab and Parler offer a stage for extremist, racist and aggressive content, but also provide a safe space for freedom fighters in authoritarian regimes. Data from such platforms offer excellent opportunities for research on issues such as linguistic bias and toxic language detection. However, only a few corpora from such platforms exist, and they are mostly unannotated and English-only. This article presents a new Telegram corpus in Russian and Belarusian, tailored for research on linguistic bias in political news. In addition, we created a repository to make all currently available corpora from so-called "dark" platforms accessible in one place.

Published

2022-05-31

How to Cite

Höhn, S., Mauw, S., & Asher, N. (2022). BelElect: A New Dataset for Bias Research from a “Dark” Platform. Proceedings of the International AAAI Conference on Web and Social Media, 16(1), 1268-1274. https://doi.org/10.1609/icwsm.v16i1.19378

Issue

Vol. 16 (2022): Proceedings of the Sixteenth International AAAI Conference on Web and Social Media

Section

Dataset Papers
