(1) Wang, H. Efficient and Robust Reinforcement Learning from Human Feedback. AAAI 2025, 39, 28730-28730.