(1) Yang, N.; Wang, P.; Liu, G.; Zhang, H.; Lyu, P.; Wang, J. Proactive Constrained Policy Optimization With Preemptive Penalty. AAAI 2026, 40, 27583-27591.