Metelli, A. M. (2021) “Policy Optimization as Online Learning with Mediator Feedback”, Proceedings of the AAAI Conference on Artificial Intelligence, 35(10), pp. 8958–8966. doi: 10.1609/aaai.v35i10.17083.