(1) Wang, H. Efficient and Robust Reinforcement Learning from Human Feedback. AAAI 2025, 39, 28730-28730.