Predicting Opioid Overdose Crude Rates with Text-Based Twitter Features (Student Abstract)
Drug use reporting is often a bottleneck for modern public health surveillance; social media data provide a real-time signal that allows opioid overdoses to be tracked and monitored. In this work, we focus on text-based feature construction for predicting opioid overdose rates at the county level. Using a Twitter dataset of over 3.4 billion tweets, we explore semantic features, such as topic features, and show that social media can be a useful indicator for forecasting opioid overdose crude rates in public health monitoring systems. In particular, combining topic and TF-IDF features with demographic features can predict opioid overdose rates at the county level.
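
As an illustration of the feature combination described above, the following minimal sketch builds one feature vector per county by concatenating TF-IDF weights, topic proportions, and demographic variables. It is not the authors' pipeline: the function names (`tfidf_vectors`, `build_features`), the whitespace tokenizer, the smoothed IDF variant, and all toy values are assumptions made for illustration. Topic proportions are assumed to be precomputed by some topic model, and the resulting vectors would then be passed to a regression model of the reader's choice.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, rows); rows[i][j] is the TF-IDF weight of vocab[j] in docs[i]."""
    # Assumption: simple lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (one common variant), so ubiquitous terms keep a positive weight.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        rows.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, rows


def build_features(county_text, county_topics, county_demographics):
    """Concatenate TF-IDF, topic, and demographic features for each county."""
    counties = sorted(county_text)
    vocab, tfidf = tfidf_vectors([county_text[c] for c in counties])
    features = {}
    for county, tfidf_row in zip(counties, tfidf):
        features[county] = (
            tfidf_row
            + list(county_topics[county])
            + list(county_demographics[county])
        )
    return vocab, features


# Toy, made-up inputs (hypothetical counties "A" and "B"), for illustration only.
county_text = {"A": "pills pain pills", "B": "pain clinic"}
county_topics = {"A": [0.7, 0.3], "B": [0.2, 0.8]}        # assumed topic proportions
county_demographics = {"A": [0.12, 41.0], "B": [0.30, 35.5]}  # assumed demographic values

vocab, features = build_features(county_text, county_topics, county_demographics)
```

Each county's vector has one entry per vocabulary term, followed by its topic proportions and demographic values, so text-based and demographic signals can be used jointly by a single downstream predictor.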