(1) Huang, H.; Yang, Y.; Sun, H.; Li, J.; Gao, Y. Simulated Rewards, Skewed Strategies: Tracing the Acquired Preference Bias in LLM-Based Dialogue Planners. AAAI 2026, 40, 21948-21956.