Towards Better Correctness and Efficiency in Code Generation

Authors

  • Yunlong Feng, Alibaba Group
  • Yang Xu, Alibaba Group
  • Xiao Xu, Alibaba Group
  • Binyuan Hui, Alibaba Group
  • Junyang Lin, Alibaba Group

DOI:

https://doi.org/10.1609/aaai.v40i36.40327

Abstract

While code large language models have demonstrated remarkable progress in code generation, the generated code often exhibits poor runtime efficiency, limiting its practical application in performance-sensitive scenarios. To address this limitation, we propose an efficiency-oriented reinforcement learning framework guided by a novel performance reward. Based on this framework, we take a deeper dive into the code efficiency problem, identifying key bottlenecks and proposing methods to overcome them: (1) Dynamic exploration overcomes the static data constraints of offline fine-tuning, enabling the discovery of more efficient code implementations. (2) An error-insensitive reinforcement learning method and high-contrast efficiency signals are crucial for mitigating systematic errors and achieving effective optimization. (3) Online exploration is most effective when starting from a high-correctness baseline, as this allows efficiency gains without sacrificing accuracy. Building on these findings, we propose a two-stage tuning method that achieves high and balanced performance across correctness and efficiency. Experimental results demonstrate the effectiveness of the method, which improves code correctness by 10.18% and runtime efficiency by 7.75% on a 7B model, achieving performance comparable to much larger models.
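The abstract does not detail how the performance reward is computed, so the sketch below is only an illustration of the general idea it describes: a reward that is gated on correctness and then scaled by runtime efficiency. The function name `performance_reward`, the use of a reference runtime, and the reward cap are assumptions made for this example, not the paper's actual design.

```python
import subprocess
import time

def performance_reward(candidate_src: str, test_src: str,
                       reference_runtime: float, timeout: float = 10.0) -> float:
    """Hypothetical reward: 0 for incorrect code, otherwise a score that
    grows as the candidate's runtime shrinks relative to a reference."""
    start = time.monotonic()
    try:
        proc = subprocess.run(
            ["python", "-c", candidate_src + "\n" + test_src],
            capture_output=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return 0.0                      # treat timeouts as failures
    elapsed = time.monotonic() - start
    if proc.returncode != 0:
        return 0.0                      # incorrect code earns no reward
    # Efficiency term: 1.0 when matching the reference, larger when faster,
    # capped to keep the signal bounded.
    return min(reference_runtime / max(elapsed, 1e-6), 2.0)
```

In such a scheme, correctness acts as a hard gate while the runtime ratio provides the high-contrast efficiency signal the abstract refers to; the cap and the choice of reference are design knobs the paper may handle differently.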

Published

2026-03-14

How to Cite

Feng, Y., Xu, Y., Xu, X., Hui, B., & Lin, J. (2026). Towards Better Correctness and Efficiency in Code Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(36), 30708-30716. https://doi.org/10.1609/aaai.v40i36.40327

Section

AAAI Technical Track on Natural Language Processing I