He, Zhouyu, Peng Qiao, Rongchun Li, Yong Dou, and Yusong Tan. “Highly Parallelized Reinforcement Learning Training With Relaxed Assignment Dependencies.” Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 16 (April 11, 2025): 17159–17167. Accessed May 13, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/33886.