The Tatort Test of Intelligence: Towards Narrative Comprehension as a Benchmark for AI

Stefan Kramer; Lennart Baur; Lars Reinhardt

doi:10.1609/aaai.v40i46.41327

Authors

Stefan Kramer, Johannes Gutenberg University Mainz
Lennart Baur, Johannes Gutenberg University Mainz
Lars Reinhardt, Johannes Gutenberg University Mainz

DOI:

https://doi.org/10.1609/aaai.v40i46.41327

Abstract

We propose—somewhat tongue-in-cheek, yet with serious implications—a new test for artificial intelligence: the ability to watch a 90-minute episode of the long-running German crime drama Tatort, and to explain every relevant detail. This involves reconstructing the evolving social network of characters, identifying their beliefs, desires, and intentions, and, crucially, determining who committed the crime. We argue that this task integrates narrative understanding, common-sense reasoning, social cognition, and theory of mind—and thus provides a uniquely challenging benchmark for AI.

Published

2026-03-14

How to Cite

Kramer, S., Baur, L., & Reinhardt, L. (2026). The Tatort Test of Intelligence: Towards Narrative Comprehension as a Benchmark for AI. Proceedings of the AAAI Conference on Artificial Intelligence, 40(46), 39722–39727. https://doi.org/10.1609/aaai.v40i46.41327

Issue

Vol. 40 No. 46: AAAI-26 Special Track AI for Social Impact II and Senior Member Presentations

Section

Senior Member Presentation
