Efroni, Yonathan, Nadav Merlis, and Shie Mannor. 2021. “Reinforcement Learning With Trajectory Feedback”. Proceedings of the AAAI Conference on Artificial Intelligence 35 (8):7288-95. https://doi.org/10.1609/aaai.v35i8.16895.