nEmesis: Which Restaurants Should You Avoid Today?
DOI:
https://doi.org/10.1609/hcomp.v1i1.13069
Keywords:
foodborne disease, language modelling, online social media, organic sensor networks
Abstract
Computational approaches to health monitoring and epidemiology continue to evolve rapidly. We present an end-to-end system, nEmesis, that automatically identifies restaurants posing public health risks. Leveraging a language model of Twitter users' online communication, nEmesis finds individuals who are likely suffering from a foodborne illness. People's visits to restaurants are modeled by matching GPS data embedded in the messages with restaurant addresses. As a result, we can assign each venue a "health score" based on the proportion of customers that fell ill shortly after visiting it. Statistical analysis reveals that our inferred health score correlates (r = 0.30) with the official inspection data from the Department of Health and Mental Hygiene (DOHMH). We investigate the joint associations of multiple factors mined from online data with the DOHMH violation scores and find that over 23% of variance can be explained by our factors. We demonstrate that readily accessible online data can be used to detect cases of foodborne illness in a timely manner. This approach offers an inexpensive way to enhance current methods to monitor food safety (e.g., adaptive inspections) and identify potentially problematic venues in near-real time.
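The health-score idea described in the abstract can be illustrated with a minimal sketch. It is not the nEmesis implementation: the function names, the input tuples, and the 72-hour window standing in for "shortly after" are all assumptions made for illustration. The sketch computes, for each venue, the fraction of distinct visitors who reported illness within the window, and includes a plain Pearson correlation of the kind used to compare such scores against inspection data.

```python
import math
from collections import defaultdict
from datetime import timedelta


def health_scores(visits, sick_reports, window=timedelta(hours=72)):
    """Fraction of each venue's distinct visitors who reported illness soon after.

    visits: iterable of (user, venue, visit_time) tuples (hypothetical format).
    sick_reports: iterable of (user, report_time) tuples (hypothetical format).
    window: assumed meaning of "shortly after"; 72 hours is illustrative only.
    """
    sick_times = defaultdict(list)
    for user, report_time in sick_reports:
        sick_times[user].append(report_time)

    visitors = defaultdict(set)
    sick_visitors = defaultdict(set)
    for user, venue, visit_time in visits:
        visitors[venue].add(user)
        # Count the user as sick if any report falls inside the window after the visit.
        if any(timedelta(0) <= t - visit_time <= window
               for t in sick_times.get(user, ())):
            sick_visitors[venue].add(user)

    return {venue: len(sick_visitors[venue]) / len(users)
            for venue, users in visitors.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)
```

In this sketch a higher score means a larger share of visitors fell ill. Passing the per-venue scores and the matching inspection violation scores to `pearson_r` would give a correlation comparable in kind to the one reported in the abstract.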