Personalized Reinforcement Learning with a Budget of Policies

Authors

  • Dmitry Ivanov (Technion, Israel)
  • Omer Ben-Porat (Technion, Israel)

DOI:

https://doi.org/10.1609/aaai.v38i11.29169

Keywords:

ML: Reinforcement Learning, MAS: Multiagent Learning

Abstract

Personalization in machine learning (ML) tailors models' decisions to the individual characteristics of users. While this approach has seen success in areas like recommender systems, its expansion into high-stakes fields such as healthcare and autonomous driving is hindered by the extensive regulatory approval processes involved. To address this challenge, we propose a novel framework termed represented Markov Decision Processes (r-MDPs) that is designed to balance the need for personalization with the regulatory constraints. In an r-MDP, we cater to a diverse user population, each with unique preferences, through interaction with a small set of representative policies. Our objective is twofold: efficiently match each user to an appropriate representative policy and simultaneously optimize these policies to maximize overall social welfare. We develop two deep reinforcement learning algorithms that efficiently solve r-MDPs. These algorithms draw inspiration from the principles of classic K-means clustering and are underpinned by robust theoretical foundations. Our empirical investigations, conducted across a variety of simulated environments, showcase the algorithms' ability to facilitate meaningful personalization even under constrained policy budgets. Furthermore, they demonstrate scalability, efficiently adapting to larger policy budgets.
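
To make the K-means analogy concrete, the following is a minimal sketch of the alternation described in the abstract, not the authors' deep reinforcement learning algorithms. It assumes a drastically simplified one-step setting in which each user has a known reward for each action and each representative policy is a single fixed action. The names fit_representatives and social_welfare and the toy reward table are hypothetical.

```python
import random


def social_welfare(rewards, policies, assignment):
    """Sum of each user's reward under the policy they are assigned to."""
    return sum(rewards[u][policies[assignment[u]]] for u in range(len(rewards)))


def assign_users(rewards, policies):
    """Assignment step: each user picks the representative policy they prefer."""
    k = len(policies)
    return [
        max(range(k), key=lambda j: rewards[u][policies[j]])
        for u in range(len(rewards))
    ]


def fit_representatives(rewards, k, iters=50, seed=0):
    """K-means-style alternation (toy assumption: a policy is one action).

    rewards[u][a] is user u's reward for action a.
    Returns (policies, assignment); like K-means, the result may be
    a local optimum that depends on the random initialization.
    """
    rng = random.Random(seed)
    n_users, n_actions = len(rewards), len(rewards[0])
    policies = [rng.randrange(n_actions) for _ in range(k)]
    assignment = assign_users(rewards, policies)
    for _ in range(iters):
        # Update step: each policy maximizes the total reward of its members.
        new_policies = list(policies)
        for j in range(k):
            members = [u for u in range(n_users) if assignment[u] == j]
            if members:  # an empty cluster keeps its current action
                new_policies[j] = max(
                    range(n_actions),
                    key=lambda a: sum(rewards[u][a] for u in members),
                )
        new_assignment = assign_users(rewards, new_policies)
        converged = new_policies == policies and new_assignment == assignment
        policies, assignment = new_policies, new_assignment
        if converged:
            break
    return policies, assignment


if __name__ == "__main__":
    # Hypothetical toy population: 4 users, 3 actions.
    toy = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for budget in (1, 2, 3):
        pols, assign = fit_representatives(toy, budget)
        print(budget, pols, assign, social_welfare(toy, pols, assign))
```

In this toy version, neither step can lower the total reward, which mirrors how K-means never increases its distortion. The full r-MDP setting replaces the single-action policies with learned sequential policies.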

Published

2024-03-24

How to Cite

Ivanov, D., & Ben-Porat, O. (2024). Personalized Reinforcement Learning with a Budget of Policies. Proceedings of the AAAI Conference on Artificial Intelligence, 38(11), 12735–12743. https://doi.org/10.1609/aaai.v38i11.29169

Issue

Vol. 38 No. 11 (2024)

Section

AAAI Technical Track on Machine Learning II