Discovering Interpretable Data-to-Sequence Generators


  • Boris Wiegand (SHS - Stahl-Holding-Saar GmbH & Co. KGaA; Saarland University)
  • Dietrich Klakow (Saarland University)
  • Jilles Vreeken (CISPA Helmholtz Center for Information Security)



Data Mining & Knowledge Management (DMKM)


We study the problem of predicting an event sequence given some metadata. In particular, we are interested in learning easily interpretable models that can accurately generate a sequence based on an attribute vector. To this end, we propose to learn a sparse event-flow graph over the training sequences, together with statistically robust rules that use the metadata to determine which paths to follow. We formalize the problem in terms of the Minimum Description Length (MDL) principle, by which we identify the best model as the one that compresses the data best. As the resulting optimization problem is NP-hard, we propose the efficient ConSequence algorithm to discover good event-flow graphs from data. Through an extensive set of experiments, including a case study, we show that ConSequence discovers compact, interpretable, and accurate models for the generation and prediction of event sequences from data, has a low sample complexity, and is particularly robust against noise.
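
To make the MDL idea in the abstract concrete, the following is a minimal sketch of two-part model selection on toy data. It is not the ConSequence algorithm or the paper's encoding, neither of which is given here. The names `two_part_cost` and `group_by`, the toy attributes `relevant` and `irrelevant`, and the flat cost of 8 bits per model parameter are all assumptions made for illustration. The sketch only shows that a model conditioning on an informative attribute compresses the sequences better, while conditioning on an uninformative attribute is penalized by its extra model cost.

```python
import math
from collections import Counter, defaultdict


def fit_event_probs(sequences):
    """Maximum-likelihood event probabilities over a list of sequences."""
    counts = Counter(event for seq in sequences for event in seq)
    total = sum(counts.values())
    return {event: c / total for event, c in counts.items()}


def data_cost_bits(sequences, probs):
    """L(D | M): bits needed to encode every event under the given probabilities."""
    return -sum(math.log2(probs[event]) for seq in sequences for event in seq)


def two_part_cost(groups, bits_per_param=8.0):
    """L(M) + L(D | M) for a model with one event distribution per group.

    bits_per_param is a crude, assumed cost for storing one probability.
    """
    model_bits = 0.0
    data_bits = 0.0
    for seqs in groups.values():
        probs = fit_event_probs(seqs)
        model_bits += bits_per_param * len(probs)
        data_bits += data_cost_bits(seqs, probs)
    return model_bits + data_bits


def group_by(records, key):
    """Split records into groups of sequences; key=None ignores all attributes."""
    groups = defaultdict(list)
    for attrs, seq in records:
        groups[None if key is None else attrs[key]].append(seq)
    return groups


# Toy data: the 'relevant' attribute fully determines the sequence,
# the 'irrelevant' attribute carries no information about it.
records = []
for i in range(20):
    attrs = {"relevant": i % 2, "irrelevant": (i // 2) % 2}
    seq = ["a"] * 4 if attrs["relevant"] == 0 else ["b"] * 4
    records.append((attrs, seq))

cost_single = two_part_cost(group_by(records, None))
cost_relevant = two_part_cost(group_by(records, "relevant"))
cost_irrelevant = two_part_cost(group_by(records, "irrelevant"))

print(f"no attribute:     {cost_single:.1f} bits")
print(f"split relevant:   {cost_relevant:.1f} bits")
print(f"split irrelevant: {cost_irrelevant:.1f} bits")
```

Under these assumptions, the split on `relevant` has the lowest total cost, and the split on `irrelevant` costs more than using no attribute at all, because it adds parameters without shortening the encoded data. This is the trade-off that an MDL score balances.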




How to Cite

Wiegand, B., Klakow, D., & Vreeken, J. (2022). Discovering Interpretable Data-to-Sequence Generators. Proceedings of the AAAI Conference on Artificial Intelligence, 36(4), 4237-4244.



AAAI Technical Track on Data Mining and Knowledge Management