InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct
DOI: https://doi.org/10.1609/aaai.v39i24.34742

Abstract
Recent advancements in open-source code large language models (LLMs) have been driven by fine-tuning on data generated by powerful closed-source LLMs, which is expensive to obtain. This paper explores whether a fine-tuned open-source model can generate additional data to augment its own instruction-tuning dataset. We make two observations: (1) a code snippet can serve as the response to different instructions, and (2) instruction-tuned code LLMs perform better at translating code into instructions than the reverse. Based on these observations, we propose Inverse-Instruct, a data augmentation technique that uses a fine-tuned LLM to generate additional instructions for the code responses in its own training dataset. The additional instruction-response pairs are added to the original dataset, and a stronger code LLM can be obtained by fine-tuning on the augmented dataset. We empirically validate Inverse-Instruct on a range of open-source code models (e.g., CodeLlama-Python and DeepSeek-Coder) and benchmarks (e.g., HumanEval(+), MBPP(+), DS-1000, and MultiPL-E), showing that it consistently improves the base models.

Published
2025-04-11
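The augmentation loop described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `summarize_code` is a hypothetical stand-in for the fine-tuned LLM's code-to-instruction step, and the toy dataset is invented for the example.

```python
# Sketch of the Inverse-Instruct augmentation loop: for each (instruction,
# code) pair, the model translates the code response back into a new
# instruction, and the new pair is appended to the dataset.

def summarize_code(code: str) -> str:
    """Placeholder for the LLM's code-to-instruction generation step."""
    first_line = code.strip().splitlines()[0]
    return f"Write a function matching: {first_line}"

def inverse_instruct(dataset):
    """Augment (instruction, code) pairs with inverse-generated instructions."""
    augmented = list(dataset)
    for _, code in dataset:
        new_instruction = summarize_code(code)   # code -> new instruction
        augmented.append((new_instruction, code))  # same response, new prompt
    return augmented

seed = [("Add two numbers.", "def add(a, b):\n    return a + b")]
augmented = inverse_instruct(seed)
# The augmented set holds the original pair plus one inverse-generated pair;
# fine-tuning on this larger set is the paper's self-improvement step.
```

In the paper, the model performing `summarize_code` is the same instruction-tuned LLM being improved, exploiting observation (2) that code-to-instruction translation is the easier direction.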
How to Cite
Wu, Y., Huang, D., Shi, W., Wang, W., Pu, Y., Gao, L., … Chen, Y. (2025). InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct. Proceedings of the AAAI Conference on Artificial Intelligence, 39(24), 25525–25533. https://doi.org/10.1609/aaai.v39i24.34742
Section
AAAI Technical Track on Natural Language Processing III