Non-exponential Reward Discounting in Reinforcement Learning
Keywords: Reinforcement Learning, Reward Discounting, Multi Agent Reinforcement Learning, Generalization In RL, Survival Analysis
Abstract
Reinforcement learning methods typically discount future rewards using an exponential scheme to achieve theoretical convergence guarantees. Studies from neuroscience, psychology, and economics suggest that human and animal behavior is better captured by the hyperbolic discounting model. Hyperbolic discounting has recently been studied in deep reinforcement learning and has shown promising results. However, this area of research is seemingly understudied, with most extant and continuing research using the standard exponential discounting formulation. My dissertation examines the effects of non-exponential discounting functions (such as hyperbolic) on an agent's learning and aims to investigate their impact on multi-agent systems and generalization tasks. A key objective of this study is to link the discounting rate to an agent's approximation of the underlying hazard rate of its environment through survival analysis.
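As a rough illustration of the contrast drawn in the abstract, the sketch below compares an exponential discount, gamma^t, with a hyperbolic one, 1 / (1 + k t), and numerically checks one standard way the two can be connected through a hazard rate: if the agent survives to time t with probability exp(-lambda t) but is uncertain about lambda, then averaging that survival probability over an exponential prior on lambda with mean k yields the hyperbolic form. The function names, the parameter values, and the choice of prior are assumptions made for this example only; they are not taken from the dissertation.

```python
import math


def exponential_discount(t, gamma=0.99):
    """Exponential discount gamma**t (gamma is an illustrative value)."""
    return gamma ** t


def hyperbolic_discount(t, k=0.01):
    """Hyperbolic discount 1 / (1 + k*t) (k is an illustrative value)."""
    return 1.0 / (1.0 + k * t)


def expected_survival(t, k=0.01, steps=20000):
    """Average exp(-lam*t) over an assumed exponential prior on the hazard rate.

    The prior density is p(lam) = (1/k) * exp(-lam/k), which has mean k.
    The integral is approximated with the midpoint rule on [0, 50*k];
    the prior mass beyond that upper limit is negligible.
    """
    upper = 50.0 * k
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        prior = math.exp(-lam / k) / k
        total += math.exp(-lam * t) * prior * h
    return total


if __name__ == "__main__":
    for t in (0, 10, 100, 1000):
        print(
            t,
            exponential_discount(t),
            hyperbolic_discount(t),
            expected_survival(t),
        )
```

Under these assumptions the last two columns should agree closely, while the exponential column decays much faster at long horizons. A known, fixed hazard rate would instead recover plain exponential discounting.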
How to Cite
Ali, R. F. (2023). Non-exponential Reward Discounting in Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(13), 16111-16112. https://doi.org/10.1609/aaai.v37i13.26916
AAAI Doctoral Consortium Track