Towards Effective Code-Integrated Reasoning

Authors

  • Fei Bai, Gaoling School of Artificial Intelligence, Renmin University of China; Beijing Key Laboratory of Research on Large Models and Intelligent Governance
  • Yingqian Min, Gaoling School of Artificial Intelligence, Renmin University of China; Beijing Key Laboratory of Research on Large Models and Intelligent Governance
  • Beichen Zhang, Gaoling School of Artificial Intelligence, Renmin University of China; Beijing Key Laboratory of Research on Large Models and Intelligent Governance
  • Zhipeng Chen, Gaoling School of Artificial Intelligence, Renmin University of China; Beijing Key Laboratory of Research on Large Models and Intelligent Governance
  • Xin Zhao, Gaoling School of Artificial Intelligence, Renmin University of China; Beijing Key Laboratory of Research on Large Models and Intelligent Governance
  • Lei Fang, DataCanvas Alaya NeW
  • Zheng Liu, BAAI
  • Zhongyuan Wang, BAAI
  • Hongteng Xu, Gaoling School of Artificial Intelligence, Renmin University of China; Beijing Key Laboratory of Research on Large Models and Intelligent Governance

DOI:

https://doi.org/10.1609/aaai.v40i36.40250

Abstract

In this paper, we investigate code-integrated reasoning (CIR), in which a model generates code when necessary and integrates the feedback obtained by executing it in a code interpreter. To acquire this capability, a model must learn when and how to use external code tools effectively, which tool-augmented reinforcement learning (RL) supports. Despite its benefits, tool-augmented RL can suffer from unstable learning dynamics. To address this challenge, we present ETIR (Effective TIR), a systematic approach to improving the training effectiveness and stability of tool-augmented RL for code-integrated reasoning. Specifically, we develop enhanced training strategies that balance exploration and stability, progressively building tool-use capabilities while improving reasoning performance. In extensive experiments on five mainstream mathematical reasoning benchmarks, our model achieves significant performance improvements over multiple competitive baselines. Furthermore, we conduct an in-depth analysis of the mechanism of code-integrated reasoning, which yields several key insights, such as the extension of the model’s capability boundaries and a simultaneous improvement in reasoning efficiency through code integration. These findings underscore the potential of code-integrated reasoning as a scalable paradigm for advancing robust and efficient language model reasoning.
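
To make the generate-execute-integrate loop described in the abstract concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The `<code>` and `<output>` delimiters, the function names `run_code`, `toy_model`, and `reason`, and the scripted stand-in for a language model are all hypothetical and chosen only for illustration. The sketch shows inference-time control flow only and does not cover the RL training strategies of ETIR.

```python
import contextlib
import io
import re

# Hypothetical delimiter for code emitted by the model (an assumption,
# not a format taken from the paper).
CODE_RE = re.compile(r"<code>(.*?)</code>", re.DOTALL)


def run_code(source: str) -> str:
    """Run source in a fresh namespace; return captured stdout or an error string."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(source, {})  # no sandboxing here; illustration only
    except Exception as exc:
        return f"Error: {type(exc).__name__}: {exc}"
    return buffer.getvalue().strip()


def toy_model(context: str) -> str:
    """Scripted stand-in for a language model (hypothetical)."""
    if "<output>" not in context:
        # First turn: decide that code is needed and emit it.
        return "I will compute this. <code>print(sum(i * i for i in range(1, 11)))</code>"
    # Later turn: read the latest interpreter feedback and answer.
    result = re.findall(r"<output>(.*?)</output>", context, re.DOTALL)[-1]
    return f"Final answer: {result}"


def reason(question: str, generate=toy_model, max_turns: int = 4) -> str:
    """Alternate generation and execution until a step contains no code."""
    context = question
    for _ in range(max_turns):
        step = generate(context)
        context += "\n" + step
        match = CODE_RE.search(step)
        if match is None:
            return step  # the model answered without requesting execution
        # Integrate interpreter feedback into the context for the next turn.
        context += f"\n<output>{run_code(match.group(1))}</output>"
    return context  # turn budget exhausted; return the full trace


if __name__ == "__main__":
    print(reason("What is 1^2 + 2^2 + ... + 10^2?"))
```

In this sketch, execution errors are returned to the model as text rather than raised, so a real model could in principle read the error and revise its code on the next turn.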

Published

2026-03-14

How to Cite

Bai, F., Min, Y., Zhang, B., Chen, Z., Zhao, X., Fang, L., … Xu, H. (2026). Towards Effective Code-Integrated Reasoning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(36), 30022–30030. https://doi.org/10.1609/aaai.v40i36.40250

Section

AAAI Technical Track on Natural Language Processing I