SRACG: A Code Generation Framework with Selective Retrieval Augmentation

Mengzhen Wang; Shukai Ma; Songwen Gong; Jiexin Wang; Ruolin Chen; Liuwen Cao; Yi Cai

doi:10.1609/aaai.v40i39.40647

Authors

Mengzhen Wang, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China
Shukai Ma, School of Software Engineering, South China University of Technology, Guangzhou, China
Songwen Gong, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China
Jiexin Wang, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China
Ruolin Chen, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China
Liuwen Cao, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China
Yi Cai, School of Software Engineering, South China University of Technology, Guangzhou, China; Key Laboratory of Big Data and Intelligent Robot (SCUT), Ministry of Education of China

DOI: https://doi.org/10.1609/aaai.v40i39.40647

Abstract

Large Language Models (LLMs) have demonstrated remarkable performance in code generation, offering new possibilities for translating natural language into executable programs. To further enhance LLMs’ code generation capabilities, Retrieval-Augmented Generation (RAG) has emerged as a promising strategy by retrieving code examples aligned with the generation intent to guide the process. However, existing RAG-based methods often suffer from unnecessary augmentation, preference misalignment, and surface-level mimicry, which undermine the effectiveness of retrieved examples in guiding LLMs toward accurate code generation. To address these challenges, we propose SRACG, a Selective Retrieval-Augmented Code Generation framework. SRACG begins with a necessity-aware selection mechanism to identify generation intents that genuinely require retrieval support, thereby avoiding degradation from indiscriminate augmentation. For intents identified as needing enhancement, it first employs a multi-objective retrieval strategy to select examples that are semantically aligned with the intent. These candidates are then further filtered by assessing their consistency with the LLM’s inherent generation preferences, ensuring alignment in both style and structure. Finally, it extracts execution plans from the filtered examples to uncover their underlying logic, guiding the LLM to better comprehend the examples instead of merely mimicking surface-level content. Experimental results on widely used benchmarks show that SRACG significantly improves the success rate of LLM-generated code and outperforms existing approaches.
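
The four stages described in the abstract can be read as a simple control flow. The Python sketch below is only an illustration of that flow under stated assumptions, not the authors' implementation: the names `Example`, `selective_rag_prompt`, `needs_retrieval`, `retrieve`, `matches_preference`, and `extract_plan` are all hypothetical, and each stage is passed in as a placeholder callable because the abstract does not specify how any of them is computed. The prompt layout is likewise invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    """A retrievable (intent, code) pair. Hypothetical structure."""
    intent: str
    code: str


def selective_rag_prompt(
    intent: str,
    needs_retrieval: Callable[[str], bool],
    retrieve: Callable[[str], List[Example]],
    matches_preference: Callable[[Example], bool],
    extract_plan: Callable[[Example], str],
) -> str:
    """Build a prompt for a code LLM, augmenting only when judged necessary.

    Every callable is a stand-in for a stage named in the abstract; how each
    stage actually works is an assumption left to the caller.
    """
    # Stage 1: necessity-aware selection. Skip augmentation entirely when
    # the intent is judged not to need retrieval support.
    if not needs_retrieval(intent):
        return f"Task: {intent}\n"

    # Stage 2: retrieve candidate examples aligned with the intent.
    candidates = retrieve(intent)

    # Stage 3: keep only candidates consistent with the model's own
    # generation preferences (style and structure).
    kept = [ex for ex in candidates if matches_preference(ex)]

    # Stage 4: pair each kept example with an extracted execution plan so
    # the prompt exposes its logic and not just its surface form.
    parts = [f"Task: {intent}\n"]
    for i, ex in enumerate(kept, start=1):
        parts.append(
            f"Example {i} plan:\n{extract_plan(ex)}\n"
            f"Example {i} code:\n{ex.code}\n"
        )
    return "\n".join(parts)
```

Passing the stages as callables keeps the sketch neutral about their internals: a caller could plug in a classifier for the necessity check or an LLM call for plan extraction without changing the control flow.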
