Designing Safety Specifications for Clinical AI: A Case Study
DOI:
https://doi.org/10.1609/aaaiss.v7i1.36898
Abstract
Clinical AI models increasingly inform care decisions, yet implicit assumptions about data timing, label semantics, calibration, and operating thresholds are rarely specified or monitored, causing subtle failures that standard metrics do not reveal. We present executable safety contracts: lightweight, task-level specifications enforced as runtime checks, applied here to hospital length-of-stay prediction. The specifications capture preconditions (data integrity, index-time alignment, censoring), postconditions (admissible outputs, alert-budget bounds), and invariants (coverage/calibration targets, subgroup equity). We implement these checks in a Python pipeline and evaluate them on a single-center MIMIC-IV cohort and a multi-center eICU-style cohort using simple baselines (logistic regression, gradient boosting) with conformal intervals and post-hoc calibration. The contracts exposed hazards that MAE (Mean Absolute Error), AUC (Area Under the ROC Curve), or ECE (Expected Calibration Error) alone missed: for example, acceptable point error with severe under-coverage in eICU, well-calibrated probabilities that nonetheless violated alert-rate constraints, and dataset-specific fairness gaps. Lightweight remedies such as conformal radius tuning, threshold/alert-scope selection, and calibration often restored compliance without degrading point performance, while clarifying when deeper modeling or policy changes were needed. Overall, the case study shows that Design by Contract principles extend beyond APIs to system-level specifications for clinical ML, providing a practical way to state safety expectations, check them with minimal compute, and make violations actionable.
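To make the three contract types concrete, the following is a minimal sketch of how a precondition, two postconditions, and an invariant might be written as runtime checks. It is an illustration only and not the paper's implementation: the `Batch` container, all field and function names, and the numeric defaults (a 20% alert budget, a 90% coverage target with 5 points of tolerance) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class Batch:
    """Hypothetical scoring batch; every field name here is an assumption."""
    feature_times: Sequence[float]  # hours relative to index time (> 0 means after index)
    y_true: Sequence[float]         # observed length of stay, in days
    lower: Sequence[float]          # lower bounds of the prediction intervals
    upper: Sequence[float]          # upper bounds of the prediction intervals
    alerts: Sequence[bool]          # whether an alert fired for each patient


def pre_index_time_alignment(b: Batch) -> bool:
    # Precondition: no feature may be recorded after the prediction index time.
    return all(t <= 0 for t in b.feature_times)


def post_admissible_outputs(b: Batch) -> bool:
    # Postcondition: intervals are non-negative and correctly ordered.
    return all(0 <= lo <= hi for lo, hi in zip(b.lower, b.upper))


def post_alert_budget(b: Batch, max_rate: float = 0.2) -> bool:
    # Postcondition: the fraction of patients alerted stays within the budget.
    return sum(b.alerts) / len(b.alerts) <= max_rate


def inv_coverage(b: Batch, target: float = 0.9, tol: float = 0.05) -> bool:
    # Invariant: empirical interval coverage stays near the nominal target.
    covered = sum(lo <= y <= hi for y, lo, hi in zip(b.y_true, b.lower, b.upper))
    return covered / len(b.y_true) >= target - tol


def check_contracts(b: Batch) -> List[str]:
    """Return the names of all violated contracts (empty list means compliant)."""
    checks: Dict[str, Callable[[Batch], bool]] = {
        "pre:index_time_alignment": pre_index_time_alignment,
        "post:admissible_outputs": post_admissible_outputs,
        "post:alert_budget": post_alert_budget,
        "invariant:coverage": inv_coverage,
    }
    return [name for name, check in checks.items() if not check(b)]


# Toy data (fabricated for illustration): intervals are valid and the alert
# rate is within budget, but only 3 of 5 outcomes fall inside their interval.
batch = Batch(
    feature_times=[-5.0, -1.0, 0.0],
    y_true=[2, 3, 4, 10, 12],
    lower=[1, 1, 1, 1, 1],
    upper=[5, 5, 5, 5, 5],
    alerts=[True, False, False, False, False],
)
violations = check_contracts(batch)
print(violations)
```

In this toy batch only the coverage invariant fails, mirroring the kind of under-coverage the abstract describes. A remedy in the spirit of conformal radius tuning would widen the intervals and re-run `check_contracts` until the list is empty.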
Published
2025-11-23
How to Cite
Ahmed, S. (2025). Designing Safety Specifications for Clinical AI: A Case Study. Proceedings of the AAAI Symposium Series, 7(1), 294-302. https://doi.org/10.1609/aaaiss.v7i1.36898
Section
Engineering Safety-Critical AI Systems