[1] Z. Zhang and B. Zhang, “When Instinct Guides and Insight Grounds: Staged RL Training for LLM Agents,” AAAI, vol. 40, no. 41, pp. 34906–34914, Mar. 2026.