(1) Efroni, Y.; Merlis, N.; Mannor, S. Reinforcement Learning With Trajectory Feedback. AAAI 2021, 35, 7288-7295.