Pragmatic Code Autocomplete

Authors

  • Gabriel Poesia, Stanford University
  • Noah Goodman, Stanford University

Keywords:

Software Engineering, Language Models

Abstract

Human language is ambiguous, and intended meanings are recovered through pragmatic reasoning in context. This reliance on context is essential to the efficiency of human communication. Programming languages, in stark contrast, are defined by unambiguous grammars. In this work, we aim to make programming languages more concise by allowing programmers to use a controlled level of ambiguity. Specifically, we allow single-character abbreviations for common keywords and identifiers. Our system first proposes a set of strings that the user can abbreviate. Using only 100 abbreviations, we observe that a large dataset of Python code can be compressed by 15%, a figure that improves further when the abbreviations are specialized to a particular code base. We then use a contextualized sequence-to-sequence model to rank potential expansions of inputs that include abbreviations. In an offline reconstruction task, our model achieves accuracies ranging from 93% to 99%, depending on the programming language and user settings. The model is small enough to run on a commodity CPU in real time. We evaluate the usability of our system in a user study, integrating it into Microsoft VSCode, a popular code editor. We observe that our system performs well and is complementary to traditional autocomplete features.
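
To make the abbreviate-then-expand idea concrete, here is a minimal Python sketch. It is not the authors' system: the function names are hypothetical, abbreviating a string to its first character is an assumption made for illustration, and a plain corpus-frequency ranking stands in for the contextualized sequence-to-sequence model described in the abstract.

```python
import re
from collections import Counter

# Identifiers/keywords, or any single non-whitespace character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\S")


def word_counts(corpus):
    """Count multi-character keyword/identifier tokens in a corpus."""
    return Counter(
        t for t in TOKEN.findall(corpus)
        if len(t) > 1 and (t[0].isalpha() or t[0] == "_")
    )


def choose_abbreviations(corpus, k):
    """Propose the k most frequent strings as abbreviatable (toy heuristic)."""
    return [t for t, _ in word_counts(corpus).most_common(k)]


def abbreviate(code, abbreviatable):
    """Replace each abbreviatable token by its first character (assumption)."""
    vocab = set(abbreviatable)
    return TOKEN.sub(
        lambda m: m.group(0)[0] if m.group(0) in vocab else m.group(0),
        code,
    )


def compression(code, abbreviatable):
    """Fraction of characters saved by abbreviating."""
    return 1 - len(abbreviate(code, abbreviatable)) / len(code)


def rank_expansions(char, corpus, abbreviatable):
    """Rank candidate expansions of one character by corpus frequency.

    A frequency prior ignores context entirely; it is only a placeholder
    for a learned, context-aware ranking model.
    """
    counts = word_counts(corpus)
    options = [t for t in abbreviatable if t.startswith(char)]
    return sorted(options, key=lambda t: -counts[t])
```

Under these assumptions, a single character such as `r` can stand for several strings at once (for example `return` and `range`), which is precisely the controlled ambiguity that a context-aware ranking model is needed to resolve.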

Published

2021-05-18

How to Cite

Poesia, G., & Goodman, N. (2021). Pragmatic Code Autocomplete. Proceedings of the AAAI Conference on Artificial Intelligence, 35(1), 445-452. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/16121

Section

AAAI Technical Track on Application Domains