Zhang, Zijing, and Boning Zhang. 2026. “When Instinct Guides and Insight Grounds: Staged RL Training for LLM Agents.” Proceedings of the AAAI Conference on Artificial Intelligence 40 (41): 34906-14. https://doi.org/10.1609/aaai.v40i41.40794.