Principled Analysis of Deep Reinforcement Learning Evaluation and Design Paradigms

Authors

  • Ezgi Korkmaz

DOI:

https://doi.org/10.1609/aaai.v40i27.39427

Abstract

From the use of deep neural networks to approximate the state-action value function, which led to winning one of the most challenging games, to algorithmic advances that made it possible to solve problems without explicitly stating the rules of the task at hand, reinforcement learning research has been at the center of remarkable scientific progress over the past decade. In this paper, we focus on the key ingredients of this progress and analyze the canonical evaluation and design paradigms in reinforcement learning. We introduce theoretical foundations for scaling laws in reinforcement learning and show that performance rankings of reinforcement learning algorithms are not monotone across data regimes; in particular, rankings measured in low-data regimes need not predict asymptotic performance rankings. We conduct large-scale experiments, and our results demonstrate that a line of reinforcement learning research conducted under the canonical design and evaluation paradigms reached incorrect conclusions. Our analysis and results provide a foundational account of the scaling, capacity, and complexity of deep reinforcement learning.
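The rank-reversal phenomenon described in the abstract can be illustrated with a toy sketch (not from the paper): two hypothetical learning curves, `perf_a` and `perf_b`, whose performance ranking flips between the low-data and asymptotic regimes. All curve shapes and constants below are illustrative assumptions, chosen only to show why an evaluation at one data budget need not transfer to another.

```python
# Toy illustration (hypothetical curves, not the paper's experiments):
# algorithm A learns quickly but saturates at a lower asymptote;
# algorithm B learns slowly but reaches a higher asymptote.
# The ranking between them therefore depends on the data regime.

def perf_a(n: float) -> float:
    """Hypothetical algorithm A: fast early learning, asymptote 0.70."""
    return 0.70 * (1.0 - 1.0 / (1.0 + n / 1e4))

def perf_b(n: float) -> float:
    """Hypothetical algorithm B: slow early learning, asymptote 0.95."""
    return 0.95 * (1.0 - 1.0 / (1.0 + n / 1e6))

# Ranking at a small data budget differs from the asymptotic ranking.
for n in (1e4, 1e5, 1e8):
    leader = "A" if perf_a(n) > perf_b(n) else "B"
    print(f"n={n:.0e}: A={perf_a(n):.3f}  B={perf_b(n):.3f}  leader={leader}")
```

Under these assumed curves, A leads at 1e4 and 1e5 samples while B leads at 1e8, so a benchmark evaluated only in the small-sample regime would rank the algorithms in the opposite order from their asymptotic comparison.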

Published

2026-03-14

How to Cite

Korkmaz, E. (2026). Principled Analysis of Deep Reinforcement Learning Evaluation and Design Paradigms. Proceedings of the AAAI Conference on Artificial Intelligence, 40(27), 22662-22670. https://doi.org/10.1609/aaai.v40i27.39427

Section

AAAI Technical Track on Machine Learning IV