Non-exponential Reward Discounting in Reinforcement Learning


  • Raja Farrukh Ali, Kansas State University



Reinforcement Learning, Reward Discounting, Multi-Agent Reinforcement Learning, Generalization in RL, Survival Analysis


Reinforcement learning methods typically discount future rewards using an exponential scheme to achieve theoretical convergence guarantees. Studies from neuroscience, psychology, and economics suggest that human and animal behavior is better captured by the hyperbolic discounting model. Hyperbolic discounting has recently been studied in deep reinforcement learning and has shown promising results. However, this area remains understudied, with most existing and ongoing research still using the standard exponential discounting formulation. My dissertation examines the effects of non-exponential discounting functions (such as hyperbolic) on an agent's learning and aims to investigate their impact on multi-agent systems and generalization tasks. A key objective of this study is to link the discounting rate to an agent's approximation of the underlying hazard rate of its environment through survival analysis.
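
As a rough illustration of the two discounting schemes and of how a hazard rate can connect them, the sketch below compares an exponential discount gamma**t with a hyperbolic discount 1 / (1 + k*t). It also checks numerically a standard survival-analysis identity: if the agent survives each unit of time under a constant but unknown hazard rate, and its belief over that rate is an exponential distribution with mean k, then the expected survival probability at time t equals the hyperbolic discount. The function names, the parameter values (gamma = 0.99, k = 0.01), and the choice of prior are assumptions made for this example and are not taken from the dissertation.

```python
import math


def exponential_discount(t, gamma=0.99):
    """Exponential discount: weight gamma**t on a reward t steps ahead."""
    return gamma ** t


def hyperbolic_discount(t, k=0.01):
    """Hyperbolic discount: weight 1 / (1 + k*t) on a reward t steps ahead."""
    return 1.0 / (1.0 + k * t)


def expected_survival(t, k=0.01, n=20_000):
    """Expected survival probability at time t under an uncertain hazard rate.

    Assumption for this sketch: the hazard rate lam is constant but unknown,
    with an exponential prior p(lam) = (1/k) * exp(-lam / k). For a known lam
    the survival probability is exp(-lam * t). The expectation over the prior
    is approximated with a midpoint rule on [0, 50*k], beyond which the prior
    mass is negligible.
    """
    upper = 50.0 * k
    dlam = upper / n
    total = 0.0
    for i in range(n):
        lam = (i + 0.5) * dlam
        prior = math.exp(-lam / k) / k
        total += prior * math.exp(-lam * t) * dlam
    return total


if __name__ == "__main__":
    for t in (0, 10, 100, 1000):
        print(
            t,
            round(exponential_discount(t), 4),
            round(hyperbolic_discount(t), 4),
            round(expected_survival(t), 4),
        )
```

Under these assumptions the last two columns printed by the script agree closely, while the exponential weights fall off much faster at long horizons. A known, fixed hazard rate lam would instead recover exponential discounting with gamma = exp(-lam).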




How to Cite

Ali, R. F. (2023). Non-exponential Reward Discounting in Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(13), 16111-16112.