Early Prediction of Diabetes Complications from Electronic Health Records: A Multi-Task Survival Analysis Approach
Keywords: Healthcare, Diabetes, EHR, Survival Analysis, Multi-task Learning
Type 2 diabetes mellitus (T2DM) is a chronic disease that usually results in multiple complications. Identifying individuals at risk for complications early, soon after a T2DM diagnosis, is of significant clinical value. In this paper, we present a data-driven approach for predicting when a patient will develop complications after the initial T2DM diagnosis. We propose a novel survival analysis method that models the time-to-event of T2DM complications and is designed to perform well on two criteria simultaneously: 1) accurate prediction of event times, and 2) correct ranking of the relative risks of any two patients. Moreover, to better capture the correlations among the times-to-event of multiple complications, we further develop a multi-task version of the survival model. To assess the performance of these approaches, we perform extensive experiments on patient-level data extracted from a large electronic health record claims database. The results show that the proposed survival analysis approach consistently outperforms traditional survival models, and they demonstrate the effectiveness of the multi-task framework over modeling each complication independently.
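As an illustration of the two evaluation criteria named in the abstract, the following minimal sketch computes a pairwise ranking score (a simplified form of Harrell's concordance index) and a mean absolute error on event times. It is not the method proposed in this paper. The function names, the choice to skip tied observation times, and the choice to measure error only on patients whose event was observed are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, predicted):
    """Fraction of comparable patient pairs whose predicted times are ordered correctly.

    times     -- observed time for each patient (event time or censoring time)
    events    -- 1 if the complication was observed, 0 if the patient was censored
    predicted -- predicted time-to-event for each patient
    A pair is comparable only if the patient with the shorter observed time
    actually had the event. Tied observed times are skipped (a simplification).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if predicted[a] < predicted[b]:
            concordant += 1.0
        elif predicted[a] == predicted[b]:
            concordant += 0.5  # tied predictions count as half
    return concordant / comparable if comparable else float("nan")


def mean_absolute_error_observed(times, events, predicted):
    """Mean absolute error of predicted event times, over uncensored patients only."""
    errors = [abs(t - p) for t, e, p in zip(times, events, predicted) if e]
    return sum(errors) / len(errors) if errors else float("nan")


if __name__ == "__main__":
    # Hypothetical toy data: four patients, the second one censored at 5.0.
    times = [2.0, 5.0, 3.0, 8.0]
    events = [1, 0, 1, 1]
    predicted = [2.5, 6.0, 2.0, 7.0]
    print(concordance_index(times, events, predicted))
    print(mean_absolute_error_observed(times, events, predicted))
```

A model can score well on one criterion and poorly on the other: predictions that preserve the order of patients but are uniformly shifted keep the ranking score unchanged while increasing the time error, which is why the two are reported separately.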