Automatic Segmentation of Data Sequences


  • Liangzhe Chen Virginia Tech
  • Sorour E. Amiri Virginia Tech
  • B. Aditya Prakash Virginia Tech


Segmenting temporal data sequences is an important problem that helps in understanding data dynamics in applications such as epidemic surveillance and motion capture. In this paper, we present DASSA, the first self-guided and efficient algorithm to automatically find a segmentation that best detects the changes of pattern in data sequences. To avoid introducing tuning parameters, we design DASSA as a multi-level method that examines segments at each level of granularity via a compact data structure called the segment-graph. We build this data structure by carefully combining the information bottleneck method with the minimum description length (MDL) principle to effectively represent each segment. Next, DASSA efficiently finds the optimal segmentation via a novel average-longest-path optimization on the segment-graph. Finally, we show how the outputs from DASSA can be naturally interpreted to reveal meaningful patterns. We ran DASSA on multiple real datasets of varying sizes; it is very effective at finding the time-cut points of the segmentations (in some cases recovering the cut points perfectly) as well as the corresponding changing patterns.
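
To make the idea of an average-longest-path search over a segment-graph concrete, the following is a minimal sketch under stated assumptions, not the DASSA implementation. It assumes that nodes are candidate segments, that an edge joins two adjacent segments, and that the edge weight measures how different they are. Here the weight is simply the absolute difference of segment means, a stand-in for the information-bottleneck and MDL-based representation the abstract describes. The names `segment_mean` and `best_average_path` are hypothetical.

```python
from itertools import combinations


def segment_mean(series, start, end):
    """Mean of series[start:end]; an assumed stand-in for a segment representation."""
    return sum(series[start:end]) / (end - start)


def best_average_path(series, candidate_cuts):
    """Return (cut_points, average_weight) for the path through the toy
    segment graph whose mean edge weight is largest (at least one cut).

    candidate_cuts must be distinct indices strictly between 0 and len(series).
    """
    n = len(series)
    bounds = [0] + sorted(candidate_cuts) + [n]
    # Nodes are candidate segments (a, b). Listing them by increasing start
    # is a topological order, since every edge goes (a, b) -> (b, c), a < b.
    segments = list(combinations(bounds, 2))
    # best[seg][k] = (total weight, previous segment) over paths that begin
    # at index 0, end with seg, and use exactly k edges.
    best = {seg: {} for seg in segments}
    for seg in segments:
        if seg[0] == 0:
            best[seg][0] = (0.0, None)
    for a, b in segments:
        for c in bounds:
            if c <= b:
                continue
            # Assumed edge weight: dissimilarity of the two adjacent segments.
            weight = abs(segment_mean(series, a, b) - segment_mean(series, b, c))
            for k, (total, _) in best[(a, b)].items():
                cand = total + weight
                if k + 1 not in best[(b, c)] or cand > best[(b, c)][k + 1][0]:
                    best[(b, c)][k + 1] = (cand, (a, b))
    # Among paths that cover the whole series, pick the best average weight.
    top = None
    for seg in segments:
        if seg[1] != n:
            continue
        for k, (total, _) in best[seg].items():
            if k >= 1 and (top is None or total / k > top[0]):
                top = (total / k, seg, k)
    if top is None:
        return [], 0.0
    avg, seg, k = top
    # Walk the stored predecessors backwards to recover the cut points.
    cuts = []
    while k > 0:
        cuts.append(seg[0])
        seg = best[seg][k][1]
        k -= 1
    return cuts[::-1], avg


if __name__ == "__main__":
    print(best_average_path([1, 1, 1, 9, 9, 9, 1, 1, 1], [2, 3, 4, 6, 7]))
```

On this toy series the sketch returns the cut points `[3, 6]` with an average weight of `8.0`. Averaging rather than summing the edge weights is what keeps the search from favoring paths with many small segments; tracking the edge count `k` in the dynamic program is one simple way to optimize an average on a directed acyclic graph.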




How to Cite

Chen, L., Amiri, S. E., & Prakash, B. A. (2018). Automatic Segmentation of Data Sequences. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). Retrieved from