[1] Y. Efroni, N. Merlis, and S. Mannor, “Reinforcement Learning with Trajectory Feedback”, AAAI, vol. 35, no. 8, pp. 7288–7295, May 2021.