CliCARE: Grounding Large Language Models in Clinical Guidelines for Decision Support over Longitudinal Cancer Electronic Health Records

Authors

  • Dongchen Li, College of Computer Science and Engineering, Northeastern University, Shenyang, China
  • Jitao Liang, College of Computer Science and Engineering, Northeastern University, Shenyang, China
  • Wei Li, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image (MIIC), Northeastern University, Shenyang, China
  • Xiaoyu Wang, Liaoning Cancer Hospital & Institute, Shenyang, China
  • Longbing Cao, Macquarie University, Sydney, Australia
  • Kun Yu, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China

DOI:

https://doi.org/10.1609/aaai.v40i37.40421

Abstract

Large Language Models (LLMs) hold significant promise for improving clinical decision support and reducing physician burnout by synthesizing complex, longitudinal cancer Electronic Health Records (EHRs). However, their implementation in this critical field faces three primary challenges: the inability to effectively process the extensive length and fragmented nature of patient records for accurate temporal analysis; a heightened risk of clinical hallucination, as conventional grounding techniques such as Retrieval-Augmented Generation (RAG) do not adequately incorporate process-oriented clinical guidelines; and unreliable evaluation metrics that hinder the validation of AI systems in oncology. To address these issues, we propose CliCARE, a framework for Grounding Large Language Models in Clinical Guidelines for Decision Support over Longitudinal Cancer Electronic Health Records. The framework operates by transforming unstructured, longitudinal EHRs into patient-specific Temporal Knowledge Graphs (TKGs) to capture long-range dependencies, and then grounding the decision support process by aligning these real-world patient trajectories with a normative guideline knowledge graph. This approach provides oncologists with evidence-grounded decision support by generating a high-fidelity clinical summary and an actionable recommendation. We validated our framework using large-scale, longitudinal data from a private Chinese cancer dataset and the public English MIMIC-IV dataset. In these settings, CliCARE significantly outperforms baselines, including leading long-context LLMs and Knowledge Graph-enhanced RAG methods. The clinical validity of our results is supported by a robust evaluation protocol, which demonstrates a high correlation with assessments made by oncologists.
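
The two steps named in the abstract (building a patient-specific temporal knowledge graph, then aligning the patient trajectory with a guideline graph) can be illustrated with a toy sketch. This is not the CliCARE implementation: the `Fact` quadruple, the functions `build_tkg` and `align_to_guideline`, the greedy in-order matching rule, and all event and guideline labels are hypothetical assumptions chosen only to make the idea concrete, and the example data carries no clinical meaning.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    """One timestamped edge of a toy temporal knowledge graph (hypothetical schema)."""
    subject: str
    relation: str
    obj: str
    when: date


def build_tkg(events):
    """Turn (date, relation, object) tuples into time-ordered quadruples for one patient."""
    facts = [Fact("patient", rel, obj, when) for when, rel, obj in events]
    return sorted(facts, key=lambda f: f.when)


def align_to_guideline(tkg, guideline_steps):
    """Greedily match guideline steps, in order, against the patient trajectory.

    Returns the matched steps and the first unmatched step (or None).
    This simple matching rule is an assumption, not the paper's alignment method.
    """
    matched = []
    idx = 0
    for fact in tkg:
        if idx < len(guideline_steps) and (fact.relation, fact.obj) == guideline_steps[idx]:
            matched.append(guideline_steps[idx])
            idx += 1
    next_step = guideline_steps[idx] if idx < len(guideline_steps) else None
    return matched, next_step


# Toy, out-of-order events as they might be extracted from fragmented records.
events = [
    (date(2023, 3, 1), "underwent", "surgery"),
    (date(2023, 1, 10), "underwent", "imaging"),
    (date(2023, 2, 5), "underwent", "biopsy"),
]

# Toy normative pathway (illustrative labels only, not clinical guidance).
guideline = [
    ("underwent", "imaging"),
    ("underwent", "biopsy"),
    ("underwent", "surgery"),
    ("received", "adjuvant_therapy"),
]

tkg = build_tkg(events)
matched, next_step = align_to_guideline(tkg, guideline)
print([f.obj for f in tkg])
print(matched, next_step)
```

In this sketch, the first unmatched guideline step stands in for the kind of evidence a downstream LLM prompt could be grounded on; how the actual framework extracts facts, represents guidelines, and performs alignment is not specified in the abstract.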

Published

2026-03-14

How to Cite

Li, D., Liang, J., Li, W., Wang, X., Cao, L., & Yu, K. (2026). CliCARE: Grounding Large Language Models in Clinical Guidelines for Decision Support over Longitudinal Cancer Electronic Health Records. Proceedings of the AAAI Conference on Artificial Intelligence, 40(37), 31554-31562. https://doi.org/10.1609/aaai.v40i37.40421

Issue

Vol. 40 No. 37 (2026)

Section

AAAI Technical Track on Natural Language Processing II