Associating Natural Language Comment and Source Code Entities

Authors

  • Sheena Panthaplackel, The University of Texas at Austin
  • Milos Gligoric, The University of Texas at Austin
  • Raymond J. Mooney, The University of Texas at Austin
  • Junyi Jessy Li, The University of Texas at Austin

DOI:

https://doi.org/10.1609/aaai.v34i05.6382

Abstract

Comments are an integral part of software development; they are natural language descriptions associated with source code elements. Understanding explicit associations can be useful in improving code comprehensibility and maintaining the consistency between code and comments. As an initial step towards this larger goal, we address the task of associating entities in Javadoc comments with elements in Java source code. We propose an approach for automatically extracting supervised data using revision histories of open source projects and present a manually annotated evaluation dataset for this task. We develop a binary classifier and a sequence labeling model by crafting a rich feature set which encompasses various aspects of code, comments, and the relationships between them. Experiments show that our systems outperform several baselines learning from the proposed supervision.

Published

2020-04-03

How to Cite

Panthaplackel, S., Gligoric, M., Mooney, R. J., & Li, J. J. (2020). Associating Natural Language Comment and Source Code Entities. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 8592-8599. https://doi.org/10.1609/aaai.v34i05.6382

Section

AAAI Technical Track: Natural Language Processing