Detecting and Tracking Concept Class Drift and Emergence in Non-Stationary Fast Data Streams

Authors

  • Brandon Parker, University of Texas at Dallas
  • Latifur Khan, University of Texas at Dallas

DOI:

https://doi.org/10.1609/aaai.v29i1.9588

Keywords:

Fast Data, Novel class detection, Non-Stationary stream classification, semi-supervised learning, stream clustering

Abstract

As constant data feeds from social media, embedded sensors, and other sources proliferate, the capability to provide predictive concept labels for these data streams will become ever more important and lucrative. However, the dynamic, non-stationary nature and effectively infinite length of data streams pose additional challenges for stream data mining algorithms. The scarcity of training data also limits the use of algorithms that depend heavily on supervised training. To address these issues, we propose an incremental semi-supervised method that not only provides general concept class label predictions but also tracks concept clusters within the feature space using a new online clustering algorithm. Each concept cluster contains an embedded stream classifier, creating a diverse ensemble for data instance classification within the generative model used for detecting emerging concepts in the stream. Unlike other recent novel class detection methods, our method goes beyond detection and continues to differentiate and track the emerging concepts. We show the effectiveness of our method on several synthetic and real-world data sets and compare the results against other leading baseline methods.
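
The general idea of tracking concept clusters online and flagging instances that fall outside all of them can be illustrated with a toy sketch. This is not the authors' algorithm: the fixed `radius` threshold, the running-mean centroid update, the majority-label counter standing in for an embedded stream classifier, and the names `ConceptCluster` and `StreamTracker` are all assumptions made for illustration only.

```python
import math
from collections import Counter


class ConceptCluster:
    """Running centroid plus label counts (a stand-in for an embedded classifier)."""

    def __init__(self, point, label=None):
        self.centroid = list(point)
        self.count = 1
        self.labels = Counter()
        if label is not None:
            self.labels[label] += 1

    def distance(self, point):
        # Euclidean distance from the centroid to the instance.
        return math.dist(self.centroid, point)

    def absorb(self, point, label=None):
        # Incremental mean update: c <- c + (x - c) / n
        self.count += 1
        for i, x in enumerate(point):
            self.centroid[i] += (x - self.centroid[i]) / self.count
        if label is not None:
            self.labels[label] += 1

    def predict(self):
        # Majority label seen in this cluster, or None if it has no labels yet.
        return self.labels.most_common(1)[0][0] if self.labels else None


class StreamTracker:
    """Hypothetical tracker: one pass over the stream, one instance at a time."""

    def __init__(self, radius):
        self.radius = radius  # assumed fixed novelty threshold
        self.clusters = []

    def process(self, point, label=None):
        """Return (prediction, is_novel) for one instance, then update the model."""
        nearest = min(self.clusters, key=lambda c: c.distance(point), default=None)
        if nearest is None or nearest.distance(point) > self.radius:
            # Outside every known cluster: treat as a possible emerging concept
            # and start tracking it as its own cluster.
            self.clusters.append(ConceptCluster(point, label))
            return None, True
        prediction = nearest.predict()
        nearest.absorb(point, label)  # a label is optional (semi-supervised)
        return prediction, False
```

In this sketch an unlabeled instance still updates the nearest centroid, so a cluster can follow gradual drift, and a cluster created for a novel instance keeps being tracked until a labeled instance gives it a class.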

Published

2015-02-21

How to Cite

Parker, B., & Khan, L. (2015). Detecting and Tracking Concept Class Drift and Emergence in Non-Stationary Fast Data Streams. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9588

Section

Main Track: Novel Machine Learning Algorithms