Beyond Markov Decision Process with Scalar Markovian Rewards
DOI:
https://doi.org/10.1609/socs.v15i1.21805

Keywords:
Problem Solving Using Search

Abstract
Real-world decision problems often involve multiple competing objectives or a complex reward structure that violates the Markov assumption. However, existing research on sequential decision making under uncertainty has primarily focused on Markov Decision Processes (MDPs) with scalar Markovian reward signals. My thesis considers settings where scalar Markovian rewards are not sufficient to produce desired behaviors. The first part of my thesis develops algorithms to optimize lexicographically ordered objectives. The second part considers autonomous agents that incorporate the perspective of their observer. Because the observer's perspective can depend on how the agents have behaved so far, rewards in this setting can depend on histories (i.e., they are non-Markovian). In the final part of my thesis, I hope to characterize, from a decision-theoretic perspective, when rewards beyond scalar Markovian signals are needed.
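To make the notion of lexicographically ordered objectives concrete, here is a minimal illustrative sketch (not taken from the thesis): each policy is summarized by a vector of per-objective values, and a higher-priority objective is only overridden when two policies are tied on it within a slack tolerance. The function name `lex_better` and the slack parameter are assumptions for illustration.

```python
def lex_better(v, w, slack=1e-6):
    """Return True if value vector v lexicographically dominates w.

    v, w: per-objective values, ordered from highest to lowest priority.
    slack: tolerance within which two objectives count as tied, so that
           lower-priority objectives are allowed to break the tie.
    """
    for a, b in zip(v, w):
        if a > b + slack:
            return True   # v wins on a higher-priority objective
        if b > a + slack:
            return False  # w wins on a higher-priority objective
    return False  # tied on all objectives (within slack)

# Policy A beats policy B on the top-priority objective, so the
# second objective is never consulted.
print(lex_better([10.0, 1.0], [9.0, 5.0]))  # True
```

In practice, a nonzero slack matters: demanding exact optimality on each higher-priority objective can leave no freedom to improve lower-priority ones, so lexicographic methods typically tolerate a small loss at each level.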
Published
2022-07-17
Issue
Section
Student Papers