Detecting VoIP Data Streams: Approaches Using Hidden Representation Learning


  • Maya Kapoor, Parsons Corporation
  • Michael Napolitano, Parsons Corporation
  • Jonathan Quance, Parsons Corporation
  • Thomas Moyer, University of North Carolina at Charlotte
  • Siddharth Krishnan, University of North Carolina at Charlotte



Deep Packet Inspection, Network Traffic Analysis, Density-based Clustering, Neural Networks, Network Security


The use of voice-over-IP (VoIP) technology has expanded rapidly over the past several years and now accounts for a significant portion of traffic in real, complex network environments. Deep packet inspection and middlebox technologies need to analyze call flows in order to perform network management, load balancing, content monitoring, forensic analysis, and intelligence gathering. Because session setup and management data can be sent on different ports from, or out of sync with, the low-latency VoIP call data carried over the Real-time Transport Protocol (RTP), inspection software may miss calls or parts of calls. To solve this problem, we engineered two deep learning models based on hidden representation learning. MAPLE, a matrix-based encoder that transforms packets into an image representation, uses convolutional neural networks to identify RTP packets within a data flow. DATE is a density-analysis-based tensor encoder that transforms packet data into a three-dimensional point cloud representation. We then perform density-based clustering over the point clouds, which serve as latent representations of the data, and classify packets as RTP or non-RTP based on their statistical clustering features. We show that these tools may allow a data collection and analysis pipeline to begin detecting and buffering RTP streams for later session association, solving the initial drop problem. MAPLE achieves over ninety-nine percent accuracy in RTP/non-RTP detection. Our experimental results show that both models can not only distinguish RTP from non-RTP packet streams but could also extend to other network traffic classification problems in real deployments of network analysis pipelines.
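To make the two representations named in the abstract more concrete, the following is a minimal Python sketch of how raw packet bytes might be turned into an image-like matrix and into a three-dimensional point cloud. It is not the authors' implementation: the abstract does not specify the encodings, so the function names `packet_to_matrix` and `packet_to_point_cloud`, the 16x16 matrix size, the zero padding, the scaling to [0, 1], and the grouping of bytes into consecutive triples are all illustrative assumptions.

```python
from typing import List, Tuple


def packet_to_matrix(payload: bytes, side: int = 16) -> List[List[float]]:
    """Hypothetical matrix encoding (assumption, not the MAPLE encoder).

    Truncates or zero-pads the packet to side*side bytes, scales each
    byte to [0, 1], and reshapes the result into a side x side grid
    that a convolutional network could consume as a one-channel image.
    """
    size = side * side
    padded = payload[:size].ljust(size, b"\x00")
    return [
        [b / 255.0 for b in padded[row * side:(row + 1) * side]]
        for row in range(side)
    ]


def packet_to_point_cloud(payload: bytes) -> List[Tuple[int, ...]]:
    """Hypothetical point-cloud encoding (assumption, not the DATE encoder).

    Treats each consecutive, non-overlapping triple of bytes as an
    (x, y, z) point; trailing bytes that do not fill a triple are dropped.
    """
    usable = len(payload) - len(payload) % 3
    return [tuple(payload[i:i + 3]) for i in range(0, usable, 3)]


# Illustrative bytes only: 0x80 in the first byte corresponds to RTP version 2.
sample = bytes([0x80, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0xA0, 0x12])
image = packet_to_matrix(sample)        # 16 x 16 grid of floats
cloud = packet_to_point_cloud(sample)   # three (x, y, z) points
```

Under these assumptions, the matrix would be the input to a convolutional classifier, and the point cloud would be the input to a density-based clustering step whose cluster statistics feed the RTP/non-RTP decision.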




How to Cite

Kapoor, M., Napolitano, M., Quance, J., Moyer, T., & Krishnan, S. (2023). Detecting VoIP Data Streams: Approaches Using Hidden Representation Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(13), 15519-15527.



IAAI Technical Track on Emerging Applications of AI