Epidemic Intelligence for the Crowd, by the Crowd
Keywords:Learning to Rank, Recommender Systems, Outbreak Detection, Social Media Stream, Twitter
Tracking Twitter for public health has shown great potential. However, most recent work has been focused on correlating Twitter messages to influenza rates, a disease that exhibits a marked seasonal pattern. In the presence of sudden outbreaks, how can social media streams be used to strengthen surveillance capacity? In May 2011, Germany reported an outbreak of Enterohemorrhagic Escherichia coli (EHEC). It was one of the largest described outbreaks of EHEC worldwide and the largest in Germany. In this work, we study the crowd's behavior in Twitter during the outbreak. In particular, we report how tracking Twitter helped to detect key user messages that triggered signal detection alarms before MedISys and other well established early warning systems. We also introduce a personalized learning to rank approach that exploits the relationships discovered by: (i) latent semantic topics computed using Latent Dirichlet Allocation (LDA), and (ii) observing the social tagging behavior in Twitter, to rank tweets for epidemic intelligence. Our results provide the grounds for new public health research based on social media.