[1] N.-C. Huang, P.-C. Hsieh, K.-H. Ho, and I.-C. Wu, “PPO-Clip Attains Global Optimality: Towards Deeper Understandings of Clipping,” AAAI, vol. 38, no. 11, pp. 12600–12607, Mar. 2024.