AP2O-Coder: Adaptively Progressive Preference Optimization for Reducing Compilation and Runtime Errors in LLM-Generated Code

Authors

  • Jianqing Zhang Shanghai Jiao Tong University
  • Wei Xia Tencent
  • Hande Dong Tencent
  • Qiang Lin Tencent
  • Jian Cao Shanghai Jiao Tong University

DOI:

https://doi.org/10.1609/aaai.v40i41.40771

Abstract

LLMs' code generation capabilities have substantially improved the effectiveness of programming tasks. However, LLM-generated code still suffers from compilation and runtime errors. Existing offline preference optimization methods primarily focus on enhancing LLMs' coding abilities using pass/fail signals in the preference data, overlooking the fine-grained error types in the failed code. To address this, we propose Adaptively Progressive Preference Optimization (AP2O) for coding (i.e., AP2O-Coder), a method that guides LLMs adaptively and methodically to reduce code errors in code generation. Specifically, we construct an error notebook from failed codes and progressively optimize the LLM to correct errors type by type. Furthermore, we adaptively replay error types to adapt to the LLM's evolving weaknesses throughout training. Through extensive experiments on both code and general LLMs (Llama, Qwen, and DeepSeek series) with parameters ranging from 0.5B to 34B, our AP2O-Coder improves code generation performance by up to 3% in pass@k while using less preference data.
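The abstract's core loop — grouping failed generations into an error notebook, then scheduling optimization over error types and replaying the types the model still struggles with — can be illustrated with a minimal sketch. This is a toy illustration of the scheduling idea only, not the paper's implementation; the function names (`build_error_notebook`, `progressive_schedule`), the data layout, and the replay threshold are all assumptions made for the example.

```python
from collections import defaultdict


def build_error_notebook(failed_samples):
    """Group failed generations by their error type (hypothetical schema:
    each sample is a dict carrying an "error_type" key such as
    "SyntaxError" or "TypeError")."""
    notebook = defaultdict(list)
    for sample in failed_samples:
        notebook[sample["error_type"]].append(sample)
    return dict(notebook)


def progressive_schedule(notebook, error_rates, threshold=0.1):
    """Order error types from most to least frequent (a plausible
    "progressive" curriculum), then re-queue ("replay") any type whose
    current error rate still exceeds `threshold` — a stand-in for the
    adaptive replay described in the abstract."""
    order = sorted(notebook, key=lambda t: len(notebook[t]), reverse=True)
    replays = [t for t in order if error_rates.get(t, 0.0) > threshold]
    return order + replays
```

For example, given two `SyntaxError` failures and one `TypeError` failure, with the model's measured `TypeError` rate still above the threshold after a pass, the schedule would visit `SyntaxError` first and replay `TypeError` at the end.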

Published

2026-03-14

How to Cite

Zhang, J., Xia, W., Dong, H., Lin, Q., & Cao, J. (2026). AP2O-Coder: Adaptively Progressive Preference Optimization for Reducing Compilation and Runtime Errors in LLM-Generated Code. Proceedings of the AAAI Conference on Artificial Intelligence, 40(41), 34701–34709. https://doi.org/10.1609/aaai.v40i41.40771

Section

AAAI Technical Track on Natural Language Processing VI