Multi-Objective MDPs with Conditional Lexicographic Reward Preferences

Authors

  • Kyle Wray, University of Massachusetts, Amherst
  • Shlomo Zilberstein, University of Massachusetts, Amherst
  • Abdel-Illah Mouaddib, University of Caen

DOI:

https://doi.org/10.1609/aaai.v29i1.9647

Keywords:

multi-objective, momdp, lmdp, mdp, lexicographic preferences

Abstract

Sequential decision problems that involve multiple objectives are prevalent. Consider for example a driver of a semi-autonomous car who may want to optimize competing objectives such as travel time and the effort associated with manual driving. We introduce a rich model called Lexicographic MDP (LMDP) and a corresponding planning algorithm called LVI that generalize previous work by allowing for conditional lexicographic preferences with slack. We analyze the convergence characteristics of LVI and establish its game theoretic properties. The performance of LVI in practice is tested within a realistic benchmark problem in the domain of semi-autonomous driving. Finally, we demonstrate how GPU-based optimization can improve the scalability of LVI and other value iteration algorithms for MDPs.
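
The core idea behind lexicographic preferences with slack is that, when optimizing a lower-priority objective, only actions whose value for each higher-priority objective stays within a stated tolerance of the best remain eligible. The sketch below illustrates this idea in plain value-iteration form; it is an illustration only, not the paper's LVI algorithm, and the function name, the sequential per-objective passes, and the `slack` parameter are assumptions made for exposition. Conditional preference orderings (which may depend on the state) are omitted for brevity.

```python
import numpy as np

def lexicographic_value_iteration(P, R, gamma, slack, iters=1000):
    """Illustrative slack-based lexicographic value iteration (not the paper's LVI).

    P     : transition tensor of shape (S, A, S)
    R     : list of reward matrices, one per objective in priority order, each (S, A)
    gamma : discount factor
    slack : per-objective tolerances; actions within slack[i] of the best value
            for objective i remain eligible when optimizing objective i + 1
    """
    S, A, _ = P.shape
    mask = np.ones((S, A), dtype=bool)  # all actions eligible for the top objective
    values = []
    for R_i, eta_i in zip(R, slack):
        V = np.zeros(S)
        for _ in range(iters):
            Q = R_i + gamma * (P @ V)        # Q[s, a] under objective i
            Q = np.where(mask, Q, -np.inf)   # restrict to currently eligible actions
            V = Q.max(axis=1)
        values.append(V)
        # Actions within eta_i of the best stay eligible for the next objective.
        mask = mask & (Q >= V[:, None] - eta_i)
    return values, mask

# Tiny hypothetical example: 2 states, 2 actions, two objectives (e.g., time, then effort).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = [np.array([[1.0, 0.0], [0.0, 1.0]]),   # primary objective rewards
     np.array([[0.0, 1.0], [1.0, 0.0]])]   # secondary objective rewards
values, eligible = lexicographic_value_iteration(P, R, gamma=0.95, slack=[0.1, 0.0])
```

With zero slack this reduces to strict lexicographic tie-breaking; a positive slack trades a bounded loss on a higher-priority objective for freedom to improve lower-priority ones.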

Published

2015-03-04

How to Cite

Wray, K., Zilberstein, S., & Mouaddib, A.-I. (2015). Multi-Objective MDPs with Conditional Lexicographic Reward Preferences. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9647