Multi-Platform Aggregated Dataset of Online Communities (MADOC)

Authors

  • Marija Mitrović Dankulov (Institute of Physics Belgrade, University of Belgrade)
  • Aleksandar Tomašević (Faculty of Philosophy, University of Novi Sad)
  • Slobodan Maletić (Vinča Institute of Nuclear Sciences, University of Belgrade)
  • Miroslav Anđelković (Vinča Institute of Nuclear Sciences, University of Belgrade)
  • Ana Vranić (Institute of Physics Belgrade, University of Belgrade)
  • Darja Cvetković (Institute of Physics Belgrade, University of Belgrade)
  • Boris Stupovski (Institute of Physics Belgrade, University of Belgrade)
  • Dušan Vudragović (Institute of Physics Belgrade, University of Belgrade)
  • Sara Major (Faculty of Philosophy, University of Novi Sad)
  • Aleksandar Bogojević (Institute of Physics Belgrade, University of Belgrade)

DOI:

https://doi.org/10.1609/icwsm.v19i1.35954

Abstract

The Multi-Platform Aggregated Dataset of Online Communities (MADOC) is a comprehensive resource that facilitates computational social science research by providing a unified, standardized dataset for cross-platform analysis of online social dynamics. MADOC aggregates and standardizes data from four distinct platforms (Bluesky, Koo, Reddit, and Voat) spanning 2012 to 2024. The dataset includes 18.9 million posts, 236 million comments, and data from 23.1 million unique users across all platforms, with a particular focus on community dynamics, user migration patterns, and the evolution of toxic behavior across platforms. By providing standardized data structures and FAIR-compliant access through Zenodo and corresponding Python and R packages, MADOC enables researchers to conduct comparative analyses of user behavior, interaction networks, and content sentiment across diverse social media environments. The unique value of the dataset lies in its cross-platform scope, standardized structure, and rich metadata, which make it particularly suitable for studying societal phenomena such as community formation, the propagation of toxic behavior, and user migration in response to platform moderation policies.

Published

2025-06-07

How to Cite

Mitrović Dankulov, M., Tomašević, A., Maletić, S., Anđelković, M., Vranić, A., Cvetković, D., Stupovski, B., Vudragović, D., Major, S., & Bogojević, A. (2025). Multi-Platform Aggregated Dataset of Online Communities (MADOC). Proceedings of the International AAAI Conference on Web and Social Media, 19(1), 2529-2538. https://doi.org/10.1609/icwsm.v19i1.35954