Clustering Longitudinal Clinical Marker Trajectories from Electronic Health Data: Applications to Phenotyping and Endotype Discovery

Peter Schulam; Fredrick Wigley; Suchi Saria

doi:10.1609/aaai.v29i1.9537

Authors

Peter Schulam, Johns Hopkins University
Fredrick Wigley, Johns Hopkins School of Medicine
Suchi Saria, Johns Hopkins University

DOI:

https://doi.org/10.1609/aaai.v29i1.9537

Keywords:

Machine Learning, Computational Medicine, Computational Phenotyping, Computational Endotyping, Time Series, Latent Variable Models, Disease Subtyping, Patient Similarity

Abstract

Diseases such as autism, cardiovascular disease, and autoimmune disorders are difficult to treat because of the remarkable degree of variation among affected individuals. Subtyping research seeks to refine the definition of such complex, multi-organ diseases by identifying homogeneous patient subgroups. In this paper, we propose the Probabilistic Subtyping Model (PSM) to identify subgroups based on clustering individual clinical severity markers. This task is challenging due to the presence of nuisance variability (variations in measurements that are not due to disease subtype), which, if not accounted for, generates biased estimates for the group-level trajectories. Measurement sparsity and irregular sampling patterns pose additional challenges in clustering such data. PSM uses a hierarchical model to account for these different sources of variability. Our experiments demonstrate that, by accounting for nuisance variability, PSM models the marker data more accurately. We also discuss novel subtypes discovered using PSM and the resulting clinical hypotheses that are now the subject of follow-up clinical experiments.
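The claim that unmodeled nuisance variability biases group-level trajectory estimates can be illustrated with a small toy example. The sketch below is not PSM and does not use the paper's data: the two patients, their offsets, their visit times, and the helper names (`ols_slope`, `center_within`, `simulate`) are all hypothetical. It assumes both patients share one subtype whose marker declines with slope -1, that each patient has a constant individual offset (the nuisance term), and that the two patients are observed over different time windows. Simple within-patient centering stands in for the hierarchical adjustment described in the abstract.

```python
# Toy illustration only: hypothetical data, not the paper's model or results.

TRUE_SLOPE = -1.0  # assumed subtype-level rate of change of the marker


def ols_slope(points):
    """Ordinary least-squares slope for a list of (time, value) pairs."""
    n = len(points)
    t_mean = sum(t for t, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in points)
    sxx = sum((t - t_mean) ** 2 for t, _ in points)
    return sxy / sxx


def center_within(points):
    """Subtract one patient's own mean time and mean value from each visit."""
    n = len(points)
    t_mean = sum(t for t, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    return [(t - t_mean, y - y_mean) for t, y in points]


def simulate(offset, times):
    """Noise-free marker values: individual offset plus the shared subtype trend."""
    return [(t, offset + TRUE_SLOPE * t) for t in times]


# Two patients of the same subtype, seen at different (irregular) times.
patients = {
    "A": simulate(offset=5.0, times=[0, 1, 2]),   # high baseline, early visits
    "B": simulate(offset=-5.0, times=[3, 4, 5]),  # low baseline, late visits
}

# Naive estimate: pool every visit and ignore who it came from.
pooled = [p for series in patients.values() for p in series]
naive_slope = ols_slope(pooled)

# Adjusted estimate: remove each patient's own level before pooling.
centered = [p for series in patients.values() for p in center_within(series)]
adjusted_slope = ols_slope(centered)

print(f"true slope:     {TRUE_SLOPE:.3f}")
print(f"naive pooled:   {naive_slope:.3f}")
print(f"within-patient: {adjusted_slope:.3f}")
```

In this toy setup the pooled fit confuses the gap between the two patients' baselines with the subtype trend and reports a much steeper decline than the true slope, while the within-patient fit recovers it. A hierarchical model such as the one the abstract describes would handle individual-level variation probabilistically rather than by centering; the centering here is only a minimal stand-in.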
