ChatGPT-Generated Code Assignment Detection Using Perplexity of Large Language Models (Student Abstract)


  • Zhenyu Xu, Texas Tech University
  • Ruoyu Xu, Texas Tech University
  • Victor S. Sheng, Texas Tech University



Keywords: Model-generated Code Detection, ChatGPT, Large Language Model


In the era of large language models such as ChatGPT, maintaining academic integrity in programming education has become challenging because these models can be misused to complete assignments. Reliable detectors for ChatGPT-generated code are therefore urgently needed. Previous studies have addressed the detection of model-generated text, but the detection of model-generated code remains largely unexplored. In this paper, we introduce a novel method for identifying ChatGPT-generated code. We apply targeted masking perturbation that focuses on code segments with high perplexity, and use a fine-tuned CodeBERT to fill in the masked segments, producing subtly perturbed samples. Our scoring scheme combines overall perplexity, variation in line-level perplexity, and burstiness. Under this scheme, a higher rank for the original code among its perturbed samples suggests that it is more likely to be ChatGPT-generated. The underlying principle is that model-generated code typically exhibits consistently low perplexity and reduced burstiness, so its rank remains relatively stable even after subtle modifications. In contrast, perturbing human-written code is more likely to produce samples that the model prefers. Our approach significantly outperforms current detectors, especially against OpenAI's text-davinci-003 model, with the average AUC rising from 0.56 (GPTZero baseline) to 0.87.
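
The scoring and ranking idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the equal weights, the burstiness definition (the Goh-Barabási parameter (σ − μ)/(σ + μ) over token surprisals), and all function names here are assumptions. Each sample is represented as a list of code lines, and each line as a list of natural-log token probabilities that would in practice come from a language model; the CodeBERT mask-filling step is not shown.

```python
import math
from statistics import mean, pstdev

def perplexity(logprobs):
    """Perplexity from natural-log token probabilities."""
    return math.exp(-sum(logprobs) / len(logprobs))

def burstiness(surprisals):
    """Assumed definition: (sigma - mu) / (sigma + mu), in [-1, 1]."""
    mu, sigma = mean(surprisals), pstdev(surprisals)
    return 0.0 if mu + sigma == 0 else (sigma - mu) / (sigma + mu)

def score(sample, weights=(1.0, 1.0, 1.0)):
    """Lower score = more 'model-like'. The weights are hypothetical."""
    lines = [line for line in sample if line]        # skip empty lines
    tokens = [lp for line in lines for lp in line]
    overall = perplexity(tokens)                     # overall perplexity
    line_var = pstdev([perplexity(l) for l in lines])  # line-level variation
    burst = burstiness([-lp for lp in tokens])       # token-level burstiness
    w1, w2, w3 = weights
    return w1 * overall + w2 * line_var + w3 * burst

def rank_of_original(original, perturbed):
    """1 = original scores better than every perturbed sample."""
    s0 = score(original)
    return 1 + sum(score(p) < s0 for p in perturbed)

def top_surprisal_positions(sample, k):
    """(line, token) indices of the k least likely tokens: mask candidates."""
    pos = [(-lp, i, j) for i, line in enumerate(sample)
           for j, lp in enumerate(line)]
    return [(i, j) for _, i, j in sorted(pos, reverse=True)[:k]]

# Toy log-probabilities (made up for illustration, not measured data).
model_like = [[-0.1, -0.1, -0.1], [-0.1, -0.1]]
model_like_perturbed = [[[-0.1, -2.0, -0.1], [-0.1, -0.1]]]

human_like = [[-0.1, -3.0, -0.1], [-2.5, -0.1]]
human_like_perturbed = [
    [[-0.1, -0.3, -0.1], [-2.5, -0.1]],
    [[-0.1, -3.0, -0.1], [-0.2, -0.1]],
    [[-0.1, -0.3, -0.1], [-0.2, -0.1]],
]

print(rank_of_original(model_like, model_like_perturbed))
print(rank_of_original(human_like, human_like_perturbed))
print(top_surprisal_positions(human_like, 2))
```

In this toy setup the uniformly predictable sample keeps the top rank after perturbation, while the sample containing surprising tokens is outranked by every perturbed variant, which mirrors the intuition stated in the abstract. A real detector would need model-derived log-probabilities and a decision threshold on the rank, neither of which is specified here.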



How to Cite

Xu, Z., Xu, R., & Sheng, V. S. (2024). ChatGPT-Generated Code Assignment Detection Using Perplexity of Large Language Models (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 38(21), 23688-23689.