Cannot Predict Comment Volume of a News Article before (a few) Users Read It


  • Lihong He, Temple University
  • Chen Shen, Temple University
  • Arjun Mukherjee, University of Houston
  • Slobodan Vucetic, Temple University
  • Eduard Dragut, Temple University



Measuring predictability of real world phenomena based on social media, e.g., spanning politics, finance, and health; Engagement, motivations, incentives, and gamification


Many news outlets allow users to comment on articles about daily world events. News articles are the seeds that spark users' interest in contributing content, i.e., comments. An article may attract apathetic user engagement (several tens of comments) or spontaneous, fervent user engagement (thousands of comments). In this paper, we study the problem of predicting the total number of user comments a news article will receive. Our main insight is that user-to-user interaction factors contribute the most to an accurate prediction, while article-specific factors have surprisingly little influence. This appears to be an interesting and understudied phenomenon: collective social behavior at a news outlet shapes user response and may even downplay the content of an article. We compile and analyze a large number of features, both novel and drawn from the literature. The features span a broad spectrum of facets, including news article and comment contents, temporal dynamics, sentiment/linguistic features, and user behaviors. We show that the early arrival rate of comments is the best indicator of the eventual number of comments. We conduct an in-depth analysis of this feature across several dimensions, such as news outlets and news article categories. We show that both the relationship between the early rate and the final number of comments and the prediction accuracy vary considerably across news outlets and news article categories (e.g., politics, sports, or health).
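
To make the notion of an early arrival rate concrete, the following is a minimal sketch, not the authors' model. It assumes the rate is the number of comments posted within a fixed window after publication divided by the window length, and it assumes a log-log linear relationship between that rate and the final comment count. The function names, the window length, and the toy numbers are all hypothetical and chosen only for illustration.

```python
import math


def early_arrival_rate(comment_times, publish_time, window):
    """Comments per time unit within `window` after publication (assumed definition)."""
    early = [t for t in comment_times if 0 <= t - publish_time <= window]
    return len(early) / window


def fit_log_linear(rates, totals):
    """Least-squares fit of log(total) = a + b * log(rate).

    The log-log form is an illustrative assumption; all rates and totals
    must be strictly positive.
    """
    xs = [math.log(r) for r in rates]
    ys = [math.log(n) for n in totals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_total(rate, a, b):
    """Predicted final comment count for a given early arrival rate."""
    return math.exp(a + b * math.log(rate))


# Toy data (fabricated for illustration only, not from the paper).
toy_rates = [1.0, 2.0, 4.0]
toy_totals = [10, 20, 40]
a, b = fit_log_linear(toy_rates, toy_totals)
```

Because the abstract reports that the relationship varies across news outlets and article categories, a sketch like this would plausibly be fit separately per outlet or per category rather than once over all articles.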




How to Cite

He, L., Shen, C., Mukherjee, A., Vucetic, S., & Dragut, E. (2021). Cannot Predict Comment Volume of a News Article before (a few) Users Read It. Proceedings of the International AAAI Conference on Web and Social Media, 15(1), 173-184.