Advancing Safe Mechanical Ventilation Using Offline RL with Hybrid Actions and Clinically Aligned Rewards

Muhammad Hamza Yousuf; Jason Li; Sahar Vahdati; Raphael Theilen; Jakob Wittenstein; Jens Lehmann

doi:10.1609/aaai.v40i46.41306

Authors

Muhammad Hamza Yousuf, Institut für Angewandte Informatik (InfAI)
Jason Li, Institut für Angewandte Informatik (InfAI)
Sahar Vahdati, Institut für Angewandte Informatik (InfAI); TIB - Leibniz Information Centre for Science and Technology, Hannover, Germany; Leibniz Universität Hannover
Raphael Theilen, Department of Anesthesiology and Intensive Care Medicine, University Hospital Carl Gustav Carus, Dresden, Germany
Jakob Wittenstein, Department of Anesthesiology and Intensive Care Medicine, University Hospital Carl Gustav Carus, Dresden, Germany
Jens Lehmann, Institut für Angewandte Informatik (InfAI); Amazon (work done outside org)

Abstract

Invasive mechanical ventilation (MV) is a life-sustaining therapy commonly used in the intensive care unit (ICU) for patients with severe and acute conditions, who frequently rely on it to breathe. Because the risk of death in such cases is high, optimal MV settings can reduce mortality, minimize ventilator-induced lung injury, shorten ICU stays, and ease the strain on healthcare resources. However, optimizing MV settings remains a complex and error-prone process because of patient-specific variability. Offline Reinforcement Learning (RL) shows promise for this task, but current methods struggle with the hybrid (continuous and discrete) nature of MV settings. Discretizing continuous settings makes the action space grow exponentially with the number of settings, which limits how many settings can be optimized. Converting the discretized predictions back to continuous values can cause a distribution shift, compromising safety and performance. To address this challenge, in the IntelliLung project we are developing an AI-based approach that constrains the action space and employs factored action critics. This approach allows us to scale to six optimizable settings, compared with two to three in previous studies. We adapt state-of-the-art offline RL algorithms to operate directly on hybrid action spaces, avoiding the pitfalls of discretization. We also introduce a clinically grounded reward function based on ventilator-free days and physiological targets. Using multi-objective optimization for reward selection, we show that this leads to a more equitable consideration of all clinically relevant objectives. Notably, we develop the system in close collaboration with healthcare professionals, so that it is aligned with real-world clinical objectives and designed with future deployment in mind.
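The scaling argument in the abstract can be illustrated with a small arithmetic sketch. It assumes, purely for illustration, that each setting is discretized into 10 bins (a hypothetical value, not taken from the paper) and compares the number of joint actions a single critic would have to score with the number of outputs needed if each setting were scored separately. Scoring settings separately is only one simple way to read "factored"; the paper's actual critic design is not described in the abstract.

```python
from math import prod


def joint_action_count(bins_per_setting: list[int]) -> int:
    """Number of joint actions when every setting is discretized
    and all combinations are enumerated."""
    return prod(bins_per_setting)


def factored_output_count(bins_per_setting: list[int]) -> int:
    """Number of outputs if each setting's bins are scored separately
    (an assumed, simplified form of factoring)."""
    return sum(bins_per_setting)


# Hypothetical discretization: 10 bins per setting.
BINS = 10

for n_settings in (2, 3, 6):
    bins = [BINS] * n_settings
    print(
        f"{n_settings} settings: "
        f"joint={joint_action_count(bins)}, "
        f"factored={factored_output_count(bins)}"
    )
```

Under this assumption the joint count grows as 10^n while the factored count grows as 10n, which is consistent with the abstract's point that naive discretization limits the number of optimizable settings.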
