Dynamic Malware Analysis with Feature Engineering and Feature Learning

Authors

  • Zhaoqi Zhang, National University of Singapore
  • Panpan Qi, National University of Singapore
  • Wei Wang, National University of Singapore

DOI:

https://doi.org/10.1609/aaai.v34i01.5474

Abstract

Dynamic malware analysis executes the program in an isolated environment and monitors its run-time behaviour (e.g., system API calls) for malware detection. This technique has proven effective against various code obfuscation techniques and newly released (“zero-day”) malware. However, existing work typically considers only the API name and ignores the arguments, or requires complex feature engineering operations and expert knowledge to process the arguments. In this paper, we propose a novel, low-cost feature extraction approach and an effective deep neural network architecture for accurate and fast malware detection. Specifically, the feature representation approach uses a feature hashing trick to encode the API call arguments associated with the API name. The deep neural network architecture applies multiple Gated-CNNs (gated convolutional neural networks) to transform the extracted features of each API call. The outputs are then processed by a bidirectional LSTM (long short-term memory network) to learn the sequential correlation among API calls. Experiments show that our solution significantly outperforms the baselines on a large real dataset. The ablation study yields valuable insights about feature engineering and architecture design.

Published

2020-04-03

How to Cite

Zhang, Z., Qi, P., & Wang, W. (2020). Dynamic Malware Analysis with Feature Engineering and Feature Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 34(01), 1210-1217. https://doi.org/10.1609/aaai.v34i01.5474

Section

AAAI Technical Track: Applications