VISION: Robust and Interpretable Code Vulnerability Detection Leveraging Counterfactual Augmentation

David Egea; Barproda Halder; Sanghamitra Dutta

doi:10.1609/aies.v8i1.36592

Authors

David Egea, Universidad Pontificia Comillas; University of Maryland
Barproda Halder, University of Maryland
Sanghamitra Dutta, University of Maryland

Abstract

Automated detection of vulnerabilities in source code is an essential cybersecurity challenge, underpinning trust in digital systems and services. Graph Neural Networks (GNNs) have emerged as a promising approach because they can learn the structural and logical relationships in code in a data-driven manner. However, the performance of GNNs is severely limited by training data imbalances and label noise. GNNs often learn “spurious” correlations from superficial code similarities in the training data, leading to detectors that do not generalize well to unseen real-world data. In this work, we propose a new unified framework for robust and interpretable vulnerability detection, which we call VISION, that mitigates spurious correlations by systematically augmenting the training dataset with counterfactuals. Counterfactuals are samples with minimal semantic modifications that have opposite prediction labels. Our complete framework includes: (i) generating effective counterfactuals by prompting a Large Language Model (LLM); (ii) targeted GNN model training on synthetically paired code examples with opposite labels; and (iii) graph-based interpretability to identify the truly crucial code statements relevant for vulnerability predictions while ignoring the spurious ones. We find that our framework reduces spurious learning and enables more robust and generalizable vulnerability detection, as demonstrated by improvements in overall accuracy (from 51.8% to 97.8%), pairwise contrast accuracy (from 4.5% to 95.8%), and worst-group accuracy (from 0.7% to 85.5%) on the widely studied Common Weakness Enumeration (CWE)-20 vulnerability. We also demonstrate improvements using our proposed metrics, namely, intra-class attribution variance, inter-class attribution distance, and node score dependency.
We provide a new benchmark for vulnerability detection, CWE-20-CFA, comprising 27,556 samples from functions affected by the high-impact and frequently occurring CWE-20 vulnerability, including both real and counterfactual examples. Furthermore, our approach advances the societal objectives of transparent and trustworthy AI-based cybersecurity systems through interactive visualization for human-in-the-loop analysis.
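
The three accuracy figures reported in the abstract can be illustrated with a minimal sketch. The abstract does not define these metrics, so the definitions below are assumptions: pairwise contrast accuracy is taken as the fraction of (original, counterfactual) pairs in which both members are classified correctly, and worst-group accuracy as the lowest accuracy over some partition of the samples into groups. All function names, the pair indexing, and the group labels are hypothetical and are not taken from the VISION code.

```python
from collections import defaultdict


def overall_accuracy(labels, preds):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(y == p for y, p in zip(labels, preds)) / len(labels)


def pairwise_contrast_accuracy(pairs, labels, preds):
    """Assumed definition: a pair (i, j) of an original sample and its
    counterfactual counts as correct only if BOTH are predicted correctly."""
    correct = sum(
        preds[i] == labels[i] and preds[j] == labels[j] for i, j in pairs
    )
    return correct / len(pairs)


def worst_group_accuracy(groups, labels, preds):
    """Assumed definition: minimum per-group accuracy, where `groups`
    assigns each sample to a (hypothetical) group identifier."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for g, y, p in zip(groups, labels, preds):
        totals[g] += 1
        hits[g] += int(y == p)
    return min(hits[g] / totals[g] for g in totals)


if __name__ == "__main__":
    # Toy data: three (vulnerable, benign) pairs; 1 = vulnerable, 0 = benign.
    labels = [1, 0, 1, 0, 1, 0]
    pairs = [(0, 1), (2, 3), (4, 5)]
    groups = ["a", "a", "b", "b", "c", "c"]

    # A degenerate detector that flags everything as vulnerable.
    always_vulnerable = [1, 1, 1, 1, 1, 1]
    print(overall_accuracy(labels, always_vulnerable))
    print(pairwise_contrast_accuracy(pairs, labels, always_vulnerable))
    print(worst_group_accuracy(groups, labels, always_vulnerable))
```

Under these assumed definitions, a detector that ignores the minimal edit separating a sample from its counterfactual can reach moderate overall accuracy while getting no pair fully right, which is why a pairwise metric is a stricter check on spurious learning than overall accuracy alone.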
