NewsDB: An Automated Approach to Build an Extensive Database of Self-Proclaimed News Providers
DOI:
https://doi.org/10.1609/icwsm.v20i1.42653
Abstract
The credibility of news obtained online has become a concern because individuals and groups can easily claim to be news publishers and share news-related content. Research on monitoring misleading information in the online news ecosystem is hindered because the community lacks a comprehensive and up-to-date list of social media pages and domains claiming to be news media. This paper employs an automated approach that uses Google's GNews API and Meta's CrowdTangle API to identify self-proclaimed news providers. Our method discovered 19k self-proclaimed news providers in the United States active in June 2022 and 23k active in October 2020. Additionally, we retrieve the posting history (totaling 191,182,320 posts) of the discovered pages. Among other findings, our analysis reveals that, on average, 300 new self-proclaimed news pages are created every four months, 56% of them do not declare a managing organization, 15% of the identified news pages are news aggregators, and 57% declare themselves to be local news.
Published
2026-05-25
How to Cite
Chouaki, S., Nguyen, M.-K., Edelson, L., Goga, O., Lauinger, T., & McCoy, D. (2026). NewsDB: An Automated Approach to Build an Extensive Database of Self-Proclaimed News Providers. Proceedings of the International AAAI Conference on Web and Social Media, 20(1), 548–562. https://doi.org/10.1609/icwsm.v20i1.42653
Issue
Section
Full Papers