1. Zhang H, Lin Y, Shen S, Han S, Lv K. Enhancing Off-Policy Constrained Reinforcement Learning through Adaptive Ensemble C Estimation. AAAI [Internet]. 2024 Mar 24 [cited 2024 Nov 15];38(19):21770-8. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/30177