Real-Time Recurrent Reinforcement Learning
DOI:
https://doi.org/10.1609/aaai.v39i17.34001
Abstract
We introduce a biologically plausible RL framework for solving tasks in partially observable Markov decision processes (POMDPs). The proposed algorithm combines three integral parts: (1) a Meta-RL architecture resembling the mammalian basal ganglia; (2) a biologically plausible reinforcement learning algorithm that uses temporal difference learning and eligibility traces to train the policy and the value function; and (3) an online automatic differentiation algorithm for computing the gradients with respect to the parameters of a shared recurrent network backbone. Our experimental results show that the method is capable of solving a diverse set of partially observable reinforcement learning tasks. The algorithm, which we call real-time recurrent reinforcement learning (RTRRL), serves as a model of learning in biological neural networks, mimicking reward pathways in the basal ganglia.
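As an illustration of the second component only, the following is a minimal sketch of textbook online TD(λ) with accumulating eligibility traces for a linear value function. It is not the RTRRL implementation: the function names (`td_lambda_update`, `one_hot`, `run_chain`), the toy three-state chain, and the hyperparameter values are all assumptions made for this example. In the method described above, the gradient of the value would come from the online differentiation of the recurrent backbone, whereas here it is simply the feature vector.

```python
def td_lambda_update(w, e, x, x_next, reward, done,
                     alpha=0.1, gamma=0.9, lam=0.8):
    """One online TD(lambda) step for a linear value function v(x) = w . x.

    Hypothetical helper for illustration; hyperparameters are arbitrary.
    """
    v = sum(wi * xi for wi, xi in zip(w, x))
    v_next = 0.0 if done else sum(wi * xi for wi, xi in zip(w, x_next))
    delta = reward + gamma * v_next - v  # TD error
    # Decay the trace, then add the gradient of v(x) w.r.t. w (which is x here).
    e = [gamma * lam * ei + xi for ei, xi in zip(e, x)]
    # Move every weight along its trace, scaled by the TD error.
    w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]
    return w, e, delta


def one_hot(i, n):
    """Tabular features: state i is encoded as a one-hot vector of length n."""
    return [1.0 if j == i else 0.0 for j in range(n)]


def run_chain(episodes=500, n=3):
    """Toy deterministic chain 0 -> 1 -> ... -> n-1 -> terminal.

    The only reward (1.0) is given on the final transition (assumed setup).
    """
    w = [0.0] * n
    for _ in range(episodes):
        e = [0.0] * n  # traces are reset at the start of each episode
        for s in range(n):
            done = s == n - 1
            x = one_hot(s, n)
            x_next = one_hot(min(s + 1, n - 1), n)
            r = 1.0 if done else 0.0
            w, e, _ = td_lambda_update(w, e, x, x_next, r, done)
    return w


if __name__ == "__main__":
    # With gamma = 0.9 the learned values should approach gamma**2, gamma, 1.
    print(run_chain())
```

Each update uses only the current transition and the running trace, so no episode history has to be stored, which is the property that makes this family of updates suitable for fully online learning.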
Published
2025-04-11
How to Cite
Lemmel, J., & Grosu, R. (2025). Real-Time Recurrent Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 39(17), 18189–18197. https://doi.org/10.1609/aaai.v39i17.34001
Section
AAAI Technical Track on Machine Learning III