[1] Z. He, P. Qiao, R. Li, Y. Dou, and Y. Tan, “Highly Parallelized Reinforcement Learning Training with Relaxed Assignment Dependencies,” AAAI, vol. 39, no. 16, pp. 17159–17167, Apr. 2025.