Predicting News Coverage of Scientific Articles
Keywords: computational journalism, science news, learning-to-rank, lambdaMART
Journalists act as gatekeepers to the scientific world, controlling what information reaches the public eye and how it is presented. Analyzing which kinds of research typically receive more media attention is vital to understanding issues such as the “science of science communication” (National Academies of Sciences, Engineering, and Medicine 2017), patterns of misinformation, and the “cycle of hype.” We track the coverage of 91,997 scientific articles published in 2016 across various disciplines, publishers, and news outlets, using metadata and text data from Altmetric, a leading tracker of scientific coverage in social and traditional media. We treat the problem as a ranking task: each day’s or each week’s papers are ranked by their likely level of media attention using the learning-to-rank model lambdaMART (Burges 2010). We find that n-gram features from the title, abstract, and press release significantly improve performance over the metadata features (journal, publisher, and subjects).
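The ranking setup described above can be illustrated with a minimal sketch. It only shows how papers might be bucketed into daily or weekly ranking groups and how a predicted ordering within a group could be scored; it does not train lambdaMART and is not the authors' pipeline. The field names `published`, `attention`, and `score`, the toy records, and the choice of NDCG as the metric are all assumptions made for illustration (NDCG is a common learning-to-rank metric, but the abstract does not say which metric was used).

```python
import math
from collections import defaultdict
from datetime import date


def group_by_period(papers, period="day"):
    """Bucket papers into ranking groups by publication day or ISO week."""
    groups = defaultdict(list)
    for paper in papers:
        d = paper["published"]
        if period == "day":
            key = d.isoformat()
        else:
            iso = d.isocalendar()  # (ISO year, ISO week, ISO weekday)
            key = f"{iso[0]}-W{iso[1]:02d}"
        groups[key].append(paper)
    return dict(groups)


def dcg(relevances):
    """Discounted cumulative gain with exponential gain 2^rel - 1."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))


def ndcg(group, k=None):
    """NDCG of the ordering induced by each paper's predicted score."""
    ranked = sorted(group, key=lambda p: p["score"], reverse=True)
    predicted = [p["attention"] for p in ranked][:k]
    ideal = sorted((p["attention"] for p in group), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    # A group with no covered papers has no meaningful ordering; report 0.0.
    return dcg(predicted) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical toy records: "attention" is a graded coverage label and
# "score" is a model's predicted ranking score (both invented here).
papers = [
    {"id": "a", "published": date(2016, 1, 4), "attention": 3, "score": 0.9},
    {"id": "b", "published": date(2016, 1, 4), "attention": 0, "score": 0.2},
    {"id": "c", "published": date(2016, 1, 5), "attention": 1, "score": 0.1},
    {"id": "d", "published": date(2016, 1, 5), "attention": 2, "score": 0.8},
    {"id": "e", "published": date(2016, 1, 11), "attention": 0, "score": 0.5},
]

daily = group_by_period(papers, period="day")
weekly = group_by_period(papers, period="week")

for key, group in sorted(weekly.items()):
    print(key, len(group), round(ndcg(group), 4))
```

Grouping matters because a learning-to-rank model compares papers only within the same group, so the same paper can rank differently under daily and weekly grouping.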