Improving the Reliability of Medical Diagnostic Models through Rule-Based Decision Deferral

Authors

  • Jacqueline Isabel Bereska: Amsterdam UMC, location University of Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands; Cancer Center Amsterdam, Amsterdam, The Netherlands; Amsterdam UMC, location University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands
  • Henk Marquering: Amsterdam UMC, location University of Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands; Cancer Center Amsterdam, Amsterdam, The Netherlands; Amsterdam UMC, location University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands
  • Marc Besselink: Cancer Center Amsterdam, Amsterdam, The Netherlands; Amsterdam UMC, location Free University of Amsterdam, Department of Surgery, Amsterdam, The Netherlands
  • Jaap Stoker: Amsterdam UMC, location University of Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands; Cancer Center Amsterdam, Amsterdam, The Netherlands
  • Inez Verpalen: Amsterdam UMC, location University of Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands; Cancer Center Amsterdam, Amsterdam, The Netherlands

DOI:

https://doi.org/10.1609/aaaiss.v1i1.27488

Keywords:

Rule-Based Decision Deferral, Pancreatic Ductal Adenocarcinoma, Tumor Resectability, Segmentation Model, Model Uncertainty, Uncertainty, Model Reliability, Computed Tomography Scans, Medical Imaging, Medical Diagnostic Modeling

Abstract

Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal cancer, and accurate assessment of tumor resectability is crucial for determining appropriate treatment. AI-based models have shown promise in classifying tumor resectability, but reliability concerns have impeded clinical implementation. We propose extending VasQNet, an AI-based model that classifies tumor resectability from AI-generated segmentations of computed tomography (CT) scans, to improve the model’s reliability. The extension allows VasQNet to defer decisions when the AI-generated segmentations violate pre-established rules on vascular anatomy, tumor location, and tumor size. We conducted experiments using CT scans of patients with (borderline) resectable and non-resectable PDAC. We evaluated the baseline VasQNet and the extended VasQNet with rule-based decision deferral (RBDD) by comparing their classifications to a ground truth provided by a radiologist, using agreement as the metric. The extended VasQNet achieved significantly higher agreement with the radiologist’s classification (90%) than the baseline VasQNet (67%). Notably, 17/31 (55%) of the deferred decisions would have been incorrect had they not been deferred. Using VasQNet as an example, our study demonstrates that RBDD can improve the reliability of clinical diagnostic models, facilitating their integration into clinical practice. The documented code is available on GitHub (https://github.com/PHAIR-Consortium/Vessel-Involvement-Quantifier).
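
The deferral idea described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper or the linked repository: the `Segmentation` fields, the rule names, and the numeric thresholds are all hypothetical placeholders, and computing agreement only over non-deferred cases is an assumption about how the metric is applied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Segmentation:
    """Hypothetical summary features of an AI-generated segmentation."""
    tumor_volume_ml: float       # assumed size measure
    tumor_inside_pancreas: bool  # assumed location check
    vessel_components: int       # assumed anatomy check (connected components)


# A rule returns True when the segmentation is plausible.
Rule = Callable[[Segmentation], bool]

# Illustrative rules only; the thresholds are placeholders, not the paper's.
RULES: Dict[str, Rule] = {
    "tumor_size": lambda s: 0.1 <= s.tumor_volume_ml <= 500.0,
    "tumor_location": lambda s: s.tumor_inside_pancreas,
    "vascular_anatomy": lambda s: s.vessel_components == 1,
}


def violated_rules(seg: Segmentation) -> List[str]:
    """Return the names of all rules the segmentation violates."""
    return [name for name, rule in RULES.items() if not rule(seg)]


def classify_with_deferral(
    seg: Segmentation,
    classifier: Callable[[Segmentation], str],
) -> Optional[str]:
    """Return the classifier's label, or None to defer the decision."""
    if violated_rules(seg):
        return None  # defer: the segmentation breaks at least one rule
    return classifier(seg)


def agreement(predictions: List[Optional[str]], ground_truth: List[str]) -> float:
    """Fraction of non-deferred predictions that match the ground truth."""
    pairs = [(p, g) for p, g in zip(predictions, ground_truth) if p is not None]
    if not pairs:
        return float("nan")  # every case was deferred
    return sum(p == g for p, g in pairs) / len(pairs)
```

In this sketch a deferred case is represented by `None`, so that it can be routed to a human reader rather than silently counted as a classification.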

Published

2023-10-03

Section

Second Symposium on Human Partnership with Medical AI: Design, Operationalization, and Ethics