Going Topological in Multi-risk Extended Markov Ratio Decision Processes

Alexander Zadorojniy; Orit Davidovich; Takayuki Osogami

doi:10.1609/icaps.v35i1.36108

Authors

Alexander Zadorojniy IBM Research - Israel
Orit Davidovich IBM Research - Israel
Takayuki Osogami IBM Research - Tokyo

DOI:

https://doi.org/10.1609/icaps.v35i1.36108

Abstract

Incorporating risk into decision making is natural, if one is to address safety concerns or operational limitations. In the context of risk-aware Markov Decision Processes (MDPs), one identifies a notion of risk which is uncertainty driven (e.g., CVaR). Risk, however, may also be inherent to the MDP setup itself, i.e., to taking certain types of actions. In that case, we would consider a decision policy to be better, if it either increases reward or reduces risk (or both). A simple mathematical formulation that expresses such a notion of improvement is the ratio of reward over risk. Though intuitive, this ratio is inherently non-linear, which introduces challenges for optimization. We provide an algorithm that solves this non-linear problem in the context of multiple risk aspects, extending upon single-risk Extended Markov Ratio Decision Processes (EMRDPs). We show that it is strongly polynomial under a monotonicity assumption over actions, satisfied, for example, in financial market applications (e.g., Quasi-Sharpe Ratio). We tackle non-linearity by integrating Walkup-Wets' topological view of parametric LPs. This topological framework highlights the non-trivial move from a single (EMRDP) to multiple risk aspects, once it is interpreted as moving from triangulations of 1-dimensional to those of m-dimensional polyhedra, with all the topological (and combinatorial) complexities this entails.

Going Topological in Multi-risk Extended Markov Ratio Decision Processes

Authors

DOI:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information