Toshniwal, Shubham, Sam Wiseman, Karen Livescu, and Kevin Gimpel. 2022. “Chess as a Testbed for Language Model State Tracking.” Proceedings of the AAAI Conference on Artificial Intelligence 36 (10): 11385-93. https://doi.org/10.1609/aaai.v36i10.21390.