SRACG: A Code Generation Framework with Selective Retrieval Augmentation

Authors

  • Mengzhen Wang, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China
  • Shukai Ma, School of Software Engineering, South China University of Technology, Guangzhou, China
  • Songwen Gong, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China
  • Jiexin Wang, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China
  • Ruolin Chen, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China
  • Liuwen Cao, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China
  • Yi Cai, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China

DOI:

https://doi.org/10.1609/aaai.v40i39.40647

Abstract

Large Language Models (LLMs) have demonstrated remarkable performance in code generation, offering new possibilities for translating natural language into executable programs. To further enhance LLMs’ code generation capabilities, Retrieval-Augmented Generation (RAG) has emerged as a promising strategy by retrieving code examples aligned with the generation intent to guide the process. However, existing RAG-based methods often suffer from unnecessary augmentation, preference misalignment, and surface-level mimicry, which undermine the effectiveness of retrieved examples in guiding LLMs toward accurate code generation. To address these challenges, we propose SRACG, a Selective Retrieval-Augmented Code Generation framework. SRACG begins with a necessity-aware selection mechanism to identify generation intents that genuinely require retrieval support, thereby avoiding degradation from indiscriminate augmentation. For intents identified as needing enhancement, it first employs a multi-objective retrieval strategy to select examples that are semantically aligned with the intent. These candidates are then further filtered by assessing their consistency with the LLM’s inherent generation preferences, ensuring alignment in both style and structure. Finally, it extracts execution plans from the filtered examples to uncover their underlying logic, guiding the LLM to better comprehend the examples instead of merely mimicking surface-level content. Experimental results on widely used benchmarks show that SRACG significantly improves the success rate of LLM-generated code and outperforms existing approaches.
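
The four stages described in the abstract can be pictured as a simple control flow. The following is only an illustrative sketch, not the authors' implementation: the function `selective_rag_prompt`, the `Example` type, the four scoring callables, and the `top_k` and `keep` parameters are all hypothetical names introduced here, and the single `relevance` score is a simplification of the multi-objective retrieval the abstract mentions. How SRACG actually decides necessity, measures preference consistency, or extracts plans is not specified on this page.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Example:
    """A retrievable code example (hypothetical structure)."""
    intent: str  # natural-language description of the example
    code: str    # the example's source code


def selective_rag_prompt(
    intent: str,
    corpus: Sequence[Example],
    needs_retrieval: Callable[[str], bool],
    relevance: Callable[[str, Example], float],
    preference: Callable[[Example], float],
    extract_plan: Callable[[Example], str],
    top_k: int = 5,
    keep: int = 2,
) -> str:
    """Build a prompt, augmenting it only when retrieval is judged necessary."""
    # Stage 1 (necessity-aware selection): leave the intent unaugmented
    # when the assumed predicate says retrieval is not needed.
    if not needs_retrieval(intent):
        return intent

    # Stage 2 (retrieval): keep the top_k examples by an assumed relevance
    # score; a real multi-objective strategy would combine several signals.
    candidates = sorted(
        corpus, key=lambda ex: relevance(intent, ex), reverse=True
    )[:top_k]

    # Stage 3 (preference filtering): keep the candidates that score highest
    # under an assumed measure of consistency with the LLM's preferences.
    preferred = sorted(candidates, key=preference, reverse=True)[:keep]

    # Stage 4 (plan extraction): pair each kept example with a plan that
    # describes its logic, then append the task itself.
    blocks = [
        f"Plan:\n{extract_plan(ex)}\nCode:\n{ex.code}" for ex in preferred
    ]
    return "\n\n".join(blocks + [f"Task: {intent}"])
```

Passing the stages in as callables keeps the sketch independent of any particular retriever or LLM; each one would be replaced by whatever model-based component a real system uses.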

Published

2026-03-14

How to Cite

Wang, M., Ma, S., Gong, S., Wang, J., Chen, R., Cao, L., & Cai, Y. (2026). SRACG: A Code Generation Framework with Selective Retrieval Augmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(39), 33584–33592. https://doi.org/10.1609/aaai.v40i39.40647

Section

AAAI Technical Track on Natural Language Processing IV