nEmesis: Which Restaurants Should You Avoid Today?


  • Adam Sadilek Google
  • Sean Brennan University of Rochester
  • Henry Kautz University of Rochester
  • Vincent Silenzio University of Rochester



Keywords: foodborne disease, language modelling, online social media, organic sensor networks


Computational approaches to health monitoring and epidemiology continue to evolve rapidly. We present an end-to-end system, nEmesis, that automatically identifies restaurants posing public health risks. Leveraging a language model of Twitter users' online communication, nEmesis finds individuals who are likely suffering from a foodborne illness. People's visits to restaurants are modeled by matching GPS data embedded in the messages with restaurant addresses. As a result, we can assign each venue a "health score" based on the proportion of customers who fell ill shortly after visiting it. Statistical analysis reveals that our inferred health score correlates (r = 0.30) with the official inspection data from the New York City Department of Health and Mental Hygiene (DOHMH). We investigate the joint associations of multiple factors mined from online data with the DOHMH violation scores and find that over 23% of the variance can be explained by our factors. We demonstrate that readily accessible online data can be used to detect cases of foodborne illness in a timely manner. This approach offers an inexpensive way to enhance current methods of monitoring food safety (e.g., adaptive inspections) and to identify potentially problematic venues in near-real time.
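
The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the `(venue_id, user_id, fell_ill)` record layout, and the assumption that illness labels and venue matches are already available are all hypothetical. The sketch computes a per-venue proportion of visitors flagged as ill, plus a plain Pearson correlation of the kind that could be used to compare such scores with inspection data.

```python
from collections import defaultdict
from math import sqrt


def health_scores(visits):
    """Return {venue_id: proportion of distinct visitors flagged as ill}.

    `visits` is an iterable of (venue_id, user_id, fell_ill) tuples; this
    layout is an assumption made for illustration only.
    """
    visitors = defaultdict(set)
    sick = defaultdict(set)
    for venue, user, fell_ill in visits:
        visitors[venue].add(user)
        if fell_ill:
            sick[venue].add(user)
    # Count each user once per venue, even if they posted several times.
    return {v: len(sick.get(v, ())) / len(users) for v, users in visitors.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy data: venue "a" has two distinct visitors, one of whom fell ill.
toy_visits = [
    ("a", "u1", True),
    ("a", "u2", False),
    ("a", "u1", True),   # repeat message from the same user
    ("b", "u3", False),
]
scores = health_scores(toy_visits)
```

In this sketch a higher value means a larger share of visitors reported illness. Under these assumptions, a venue's score would be paired with its inspection score and the pairs passed to `pearson_r`.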




How to Cite

Sadilek, A., Brennan, S., Kautz, H., & Silenzio, V. (2013). nEmesis: Which Restaurants Should You Avoid Today? Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 1(1), 138-146.