The Tatort Test of Intelligence: Towards Narrative Comprehension as a Benchmark for AI

Authors

  • Stefan Kramer, Johannes Gutenberg University Mainz
  • Lennart Baur, Johannes Gutenberg University Mainz
  • Lars Reinhardt, Johannes Gutenberg University Mainz

DOI:

https://doi.org/10.1609/aaai.v40i46.41327

Abstract

We propose—somewhat tongue-in-cheek, yet with serious implications—a new test for artificial intelligence: the ability to watch a 90-minute episode of the long-running German crime drama Tatort, and to explain every relevant detail. This involves reconstructing the evolving social network of characters, identifying their beliefs, desires, and intentions, and, crucially, determining who committed the crime. We argue that this task integrates narrative understanding, common-sense reasoning, social cognition, and theory of mind—and thus provides a uniquely challenging benchmark for AI.

Published

2026-03-14

How to Cite

Kramer, S., Baur, L., & Reinhardt, L. (2026). The Tatort Test of Intelligence: Towards Narrative Comprehension as a Benchmark for AI. Proceedings of the AAAI Conference on Artificial Intelligence, 40(46), 39722–39727. https://doi.org/10.1609/aaai.v40i46.41327