NELA-Local: A Dataset of U.S. Local News Articles for the Study of County-Level News Ecosystems

Authors

  • Benjamin D. Horne, School of Information Sciences, The University of Tennessee Knoxville
  • Maurício Gruppi, Computer Science, Rensselaer Polytechnic Institute
  • Kenneth Joseph, Computer Science and Engineering, University at Buffalo
  • Jon Green, Network Science Institute, Northeastern University
  • John P. Wihbey, School of Journalism and Media Innovation, Northeastern University
  • Sibel Adalı, Computer Science, Rensselaer Polytechnic Institute

Keywords:

Analysis of the relationship between social media and mainstream media, Credibility of online content, Trend identification and tracking; time series forecasting

Abstract

In this paper, we present a dataset of over 1.4M online news articles from 313 local U.S. news outlets published over 20 months (between April 4th, 2020 and December 31st, 2021). These outlets cover a geographically diverse set of communities across the United States. To help estimate characteristics of the local audience, the article data is accompanied by a wide range of county-level metadata, including demographics, 2020 Presidential Election vote shares, and community resilience estimates from the U.S. Census Bureau. The NELA-Local dataset can be found at: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GFE66K.

Published

2022-05-31

How to Cite

Horne, B. D., Gruppi, M., Joseph, K., Green, J., Wihbey, J. P., & Adalı, S. (2022). NELA-Local: A Dataset of U.S. Local News Articles for the Study of County-Level News Ecosystems. Proceedings of the International AAAI Conference on Web and Social Media, 16(1), 1275-1284. Retrieved from https://ojs.aaai.org/index.php/ICWSM/article/view/19379