Benchmarking Reinforcement Learning Algorithms for ICU Ventilator Settings: An Interpretable and Probabilistic Patient Environment for Doctor Agents

Authors

  • Ya-Hsi Chang, National Tsing Hua University, Hsinchu, Taiwan
  • Po-Chih Kuo, National Tsing Hua University, Hsinchu, Taiwan

DOI:

https://doi.org/10.1609/aaai.v40i24.39081

Abstract

Mechanical ventilation is essential in intensive care units (ICUs), but prolonged use increases patient risk. Reinforcement learning (RL) offers potential for optimizing ventilator management, yet its clinical adoption is limited by the lack of interpretable and realistic simulation environments. We propose an interpretable and probabilistic patient environment simulator based on action-based k-nearest neighbors and empirical transition probabilities, modeling stochastic state transitions grounded in real ICU data (MIMIC-IV and eICU). The simulator supports anomaly detection and provides probabilistic next-state distributions to enhance transparency and safety. Within this environment, we benchmark seven offline RL algorithms under clinically guided reward designs, including five distinct reward function configurations to explore the impact of reward shaping on agent behavior. Our results show that RL agents such as Double DQN and NFQ outperform empirical physician policies in meeting extubation guidelines, especially for high-severity patients. This benchmark enables standardized, interpretable evaluation of RL-based decision support tools for critical care.
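The abstract's simulator pairs action-based k-nearest neighbors with empirical transition probabilities. A minimal sketch of that idea, not the paper's implementation: for a query (state, action), retrieve the k historical patients who were most similar and received the same action, treat their observed next states as an empirical next-state distribution, and flag queries far from all neighbors as anomalous. The class name, parameters, and uniform neighbor weighting below are illustrative assumptions.

```python
import numpy as np

class KNNPatientEnv:
    """Illustrative action-conditioned k-NN transition model.

    Given a query (state, action), finds the k most similar historical
    states that received the same action and returns the empirical
    distribution over their observed next states. Queries whose nearest
    neighbor lies beyond `anomaly_threshold` are flagged as anomalous.
    """

    def __init__(self, states, actions, next_states, k=5, anomaly_threshold=2.0):
        self.states = np.asarray(states, dtype=float)
        self.actions = np.asarray(actions)
        self.next_states = np.asarray(next_states, dtype=float)
        self.k = k
        self.anomaly_threshold = anomaly_threshold

    def step_distribution(self, state, action):
        # Restrict to historical transitions under the same action.
        mask = self.actions == action
        cand_states = self.states[mask]
        cand_next = self.next_states[mask]
        # Euclidean distance from the query state to each candidate.
        dists = np.linalg.norm(cand_states - np.asarray(state, dtype=float), axis=1)
        idx = np.argsort(dists)[: self.k]
        # Anomaly: even the closest match is far from the query.
        is_anomaly = bool(dists[idx[0]] > self.anomaly_threshold)
        # Empirical next-state distribution: uniform over the k neighbors.
        probs = np.full(len(idx), 1.0 / len(idx))
        return cand_next[idx], probs, is_anomaly

    def sample_next_state(self, state, action, rng=None):
        # Stochastic transition: sample one neighbor's observed outcome.
        rng = np.random.default_rng() if rng is None else rng
        nexts, probs, _ = self.step_distribution(state, action)
        return nexts[rng.choice(len(nexts), p=probs)]
```

Returning the full next-state distribution (rather than only a sample) is what supports the interpretability claim: a clinician can inspect which real patient trajectories the predicted transition is grounded in, and the anomaly flag marks state-action queries unsupported by the data.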

Published

2026-03-14

How to Cite

Chang, Y.-H., & Kuo, P.-C. (2026). Benchmarking Reinforcement Learning Algorithms for ICU Ventilator Settings: An Interpretable and Probabilistic Patient Environment for Doctor Agents. Proceedings of the AAAI Conference on Artificial Intelligence, 40(24), 19970-19977. https://doi.org/10.1609/aaai.v40i24.39081

Section

AAAI Technical Track on Machine Learning I