Reward Design for an Online Reinforcement Learning Algorithm Supporting Oral Self-Care

Authors

  • Anna L. Trella, Harvard University
  • Kelly W. Zhang, Harvard University
  • Inbal Nahum-Shani, University of Michigan
  • Vivek Shetty, University of California, Los Angeles
  • Finale Doshi-Velez, Harvard University
  • Susan A. Murphy, Harvard University

DOI:

https://doi.org/10.1609/aaai.v37i13.26866

Keywords:

Reinforcement Learning (RL), Reward Design, Online, Mobile Health

Abstract

While dental disease is largely preventable, professional advice on optimal oral hygiene practices is often forgotten or abandoned by patients. Therefore, patients may benefit from timely and personalized encouragement to engage in oral self-care behaviors. In this paper, we develop an online reinforcement learning (RL) algorithm for use in optimizing the delivery of mobile-based prompts to encourage oral hygiene behaviors. One of the main challenges in developing such an algorithm is ensuring that the algorithm considers the impact of current actions on the effectiveness of future actions (i.e., delayed effects), especially when the algorithm has been designed to run stably and autonomously in a constrained, real-world setting characterized by highly noisy, sparse data. We address this challenge by designing a quality reward that maximizes the desired health outcome (i.e., high-quality brushing) while minimizing user burden. We also highlight a procedure for optimizing the hyperparameters of the reward by building a simulation environment test bed and evaluating candidates using the test bed. The RL algorithm discussed in this paper will be deployed in Oralytics. To the best of our knowledge, Oralytics is the first mobile health study utilizing an RL algorithm designed to prevent dental disease by optimizing the delivery of motivational messages supporting oral self-care behaviors.
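
The idea of a reward that trades off brushing quality against user burden can be illustrated with a minimal sketch. This is not the formula from the paper: the function name, the linear penalty, and the parameters `recent_prompt_rate` and `burden_weight` are all hypothetical, chosen only to show how a burden term might be subtracted from the health outcome.

```python
def surrogate_reward(brushing_quality: float,
                     sent_prompt: bool,
                     recent_prompt_rate: float,
                     burden_weight: float = 30.0) -> float:
    """Brushing quality minus a burden penalty.

    Illustrative assumption only, not the paper's reward definition.
    brushing_quality   -- observed outcome, e.g. seconds of quality brushing
    sent_prompt        -- whether a prompt was sent at this decision time
    recent_prompt_rate -- hypothetical fraction of recent decision times
                          at which a prompt was sent (0.0 to 1.0)
    burden_weight      -- hypothetical hyperparameter scaling the penalty
    """
    # The penalty applies only when a prompt is sent, and it grows with
    # how often the user has been prompted recently.
    penalty = burden_weight * recent_prompt_rate if sent_prompt else 0.0
    return brushing_quality - penalty
```

In a setup like this, `burden_weight` is the kind of hyperparameter that could be compared across candidate values in a simulation test bed before deployment.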

Published

2023-09-06

How to Cite

Trella, A. L., Zhang, K. W., Nahum-Shani, I., Shetty, V., Doshi-Velez, F., & Murphy, S. A. (2023). Reward Design for an Online Reinforcement Learning Algorithm Supporting Oral Self-Care. Proceedings of the AAAI Conference on Artificial Intelligence, 37(13), 15724-15730. https://doi.org/10.1609/aaai.v37i13.26866

Section

IAAI Technical Track on Emerging Applications of AI