[1]
Zhang, Z. and Zhang, B. 2026. When Instinct Guides and Insight Grounds: Staged RL Training for LLM Agents. Proceedings of the AAAI Conference on Artificial Intelligence. 40, 41 (Mar. 2026), 34906–34914. DOI:https://doi.org/10.1609/aaai.v40i41.40794.