Optimizing Vital Sign Monitoring in Resource-Constrained Maternal Care: An RL-Based Restless Bandit Approach

Niclas Boehmer; Yunfan Zhao; Guojun Xiong; Paula Rodriguez-Diaz; Paola Del Cueto Cibrian; Joseph Ngonzi; Adeline Boatin; Milind Tambe

doi:10.1609/aaai.v39i28.35149

Authors

Niclas Boehmer School of Engineering and Applied Sciences, Harvard University, USA; Hasso Plattner Institute, University of Potsdam, Germany
Yunfan Zhao School of Engineering and Applied Sciences, Harvard University, USA; GE Healthcare, USA
Guojun Xiong School of Engineering and Applied Sciences, Harvard University, USA
Paula Rodriguez-Diaz School of Engineering and Applied Sciences, Harvard University, USA
Paola Del Cueto Cibrian Department of Obstetrics and Gynecology, Massachusetts General Hospital, Harvard Medical School, USA
Joseph Ngonzi Mbarara University of Science and Technology, Uganda
Adeline Boatin Department of Obstetrics and Gynecology, Massachusetts General Hospital, Harvard Medical School, USA
Milind Tambe School of Engineering and Applied Sciences, Harvard University, USA

Abstract

Maternal mortality remains a significant global public health challenge. One promising approach to reducing maternal deaths occurring during facility-based childbirth is through early warning systems, which require the consistent monitoring of mothers' vital signs after giving birth. Wireless vital sign monitoring devices offer a labor-efficient solution for continuous monitoring, but their scarcity raises the critical question of how to allocate them most effectively. We devise an allocation algorithm for this problem by modeling it as a variant of the popular Restless Multi-Armed Bandit (RMAB) paradigm. In doing so, we identify and address novel, previously unstudied constraints unique to this domain, which render previous approaches for RMABs unsuitable and significantly increase the complexity of the learning and planning problem. To overcome these challenges, we adopt the popular Proximal Policy Optimization (PPO) algorithm from reinforcement learning to learn an allocation policy by training a policy and value function network. We demonstrate in simulations that our approach outperforms the best heuristic baseline by up to a factor of 4.
