Multi-View Graph Representation for Programming Language Processing: An Investigation into Algorithm Detection

Authors

  • Ting Long, Shanghai Jiao Tong University
  • Yutong Xie, University of Michigan
  • Xianyu Chen, Shanghai Jiao Tong University
  • Weinan Zhang, Shanghai Jiao Tong University
  • Qinxiang Cao, Shanghai Jiao Tong University
  • Yong Yu, Shanghai Jiao Tong University

DOI:

https://doi.org/10.1609/aaai.v36i5.20522

Keywords:

Knowledge Representation And Reasoning (KRR)

Abstract

Program representation, which aims to convert program source code into vectors with automatically extracted features, is a fundamental problem in programming language processing (PLP). Recent work represents programs with neural networks based on source code structures. However, such methods often focus on syntax and consider only a single perspective of a program, which limits the representation power of the models. This paper proposes a multi-view graph (MVG) program representation method. MVG pays more attention to code semantics and includes both data flow and control flow simultaneously as multiple views. These views are then combined and processed by a graph neural network (GNN) to obtain a comprehensive program representation that covers various aspects. We thoroughly evaluate the proposed MVG approach in the context of algorithm detection, an important and challenging subfield of PLP. Specifically, we use the public dataset POJ-104 and also construct a new, challenging dataset, ALG-109, to test our method. In experiments, MVG significantly outperforms previous methods, demonstrating the model's strong capability for representing source code.
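To make the idea of combining views concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a hypothetical four-statement program, hand-written control-flow and data-flow edge lists, scalar node features, and fixed per-view weights; the names `VIEWS`, `combine_views`, and `message_pass` are illustrative only. A real GNN would use learned weight matrices and vector features, and the abstract does not specify the architecture.

```python
from collections import defaultdict

# Hypothetical toy program (an assumption for illustration), one node per statement:
#   0: x = 1
#   1: y = x + 1
#   2: if y > 1:
#   3:     x = y

# Each view is a list of directed edges (source, target), written by hand here.
VIEWS = {
    "control_flow": [(0, 1), (1, 2), (2, 3)],  # execution order
    "data_flow": [(0, 1), (1, 2), (1, 3)],     # definition -> use
}


def combine_views(views):
    """Merge the per-view edge lists into one list of typed edges."""
    return [(src, dst, name) for name, edges in views.items() for src, dst in edges]


def message_pass(features, typed_edges, view_weights):
    """One round of sum aggregation with a separate scalar weight per view."""
    incoming = defaultdict(float)
    for src, dst, view in typed_edges:
        incoming[dst] += view_weights[view] * features[src]
    # Each node keeps its own feature and adds the weighted incoming messages.
    return [features[node] + incoming[node] for node in range(len(features))]


edges = combine_views(VIEWS)
features = [1.0, 1.0, 1.0, 1.0]                     # assumed initial node features
weights = {"control_flow": 0.5, "data_flow": 2.0}   # assumed, not learned
updated = message_pass(features, edges, weights)

# Mean pooling over nodes gives a single value standing in for the program vector.
graph_vector = sum(updated) / len(updated)
print(updated, graph_vector)
```

Keeping the edge type on every edge lets the aggregation step treat control-flow and data-flow neighbours differently, which is the point of holding several views in one graph.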

Published

2022-06-28

How to Cite

Long, T., Xie, Y., Chen, X., Zhang, W., Cao, Q., & Yu, Y. (2022). Multi-View Graph Representation for Programming Language Processing: An Investigation into Algorithm Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 36(5), 5792-5799. https://doi.org/10.1609/aaai.v36i5.20522

Section

AAAI Technical Track on Knowledge Representation and Reasoning