(1) Xiong, G.; Tambe, M. VORTEX: Aligning Task Utility and Human Preferences Through LLM-Guided Reward Shaping. AAAI 2026, 40, 27162-27170.