Enhancing Human-AI Trust in Cyber Threat Intelligence via Interpretable Attack Phase Classification

Muhammad Saad Rashad; Mousa Al-Kfairy; Muhammad Amin; Hafeez Anwar; Waqas Ali; Sajid Anwar

doi:10.1609/aaaiss.v9i1.42911

Authors

Muhammad Saad Rashad National University of Computer & Emerging Sciences
Mousa Al-Kfairy Zayed University
Muhammad Amin National University of Computer & Emerging Sciences
Hafeez Anwar National University of Computer & Emerging Sciences
Waqas Ali National University of Computer & Emerging Sciences
Sajid Anwar Institute of Management Sciences

DOI:

https://doi.org/10.1609/aaaiss.v9i1.42911

Abstract

The classification of cyber threat intelligence (CTI) indicators into attack phases is essential for effective threat analysis and automated defense systems. However, such indicators are often short, sparse, and highly imbalanced, which limits the effectiveness of sequence-based deep learning approaches and hinders the interpretability that is essential for establishing Human-AI trust in operational cybersecurity settings. In this work, we propose a hybrid classification framework that combines TF-IDF representations with dimensionality reduction and domain-specific binary features capturing structural properties of cyber indicators. Classical machine learning models, including Support Vector Machines (SVM), Decision Trees, and Logistic Regression, are evaluated and compared with an LSTM-based sequence model. Experimental results on a STIX-based dataset demonstrate that the classical models consistently outperform the LSTM baseline, with the SVM achieving the highest macro-F1 score of 0.924 on the held-out test set. Cross-validation further confirms the robustness of the proposed approach, with only marginal variation across models. Beyond predictive performance, this study incorporates explainability analysis using SHAP to provide transparent insights into feature contributions across attack phases, supporting analyst trust and informed decision making. These results demonstrate that sparse lexical representations augmented with domain knowledge offer an efficient, interpretable, and trustworthy solution for attack-phase classification in operational CTI environments.
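
The abstract reports macro-F1 rather than accuracy, which matters because the indicators are highly imbalanced. As a minimal sketch of that metric (not the authors' evaluation code), the snippet below computes macro-F1 as the unweighted mean of per-class F1 scores. The phase names `delivery` and `exploitation` and the toy labels are hypothetical and chosen only for illustration; they are not taken from the paper's dataset.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced toy data: 8 majority-class and 2 minority-class indicators.
y_true = ["delivery"] * 8 + ["exploitation"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["delivery"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f} macro_f1={score:.3f}")
```

In this toy case the always-majority classifier reaches an accuracy of 0.8 but a macro-F1 of only about 0.444, because the minority phase contributes an F1 of zero with equal weight. This is why macro-F1 is a more demanding summary than accuracy when attack phases are unevenly represented.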