Yang, N., Wang, P., Liu, G., Zhang, H., Lyu, P., & Wang, J. (2026). Proactive Constrained Policy Optimization with Preemptive Penalty. Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), 27583–27591. https://doi.org/10.1609/aaai.v40i32.39978