Li, L. (2026). Evaluating the Architectural Reasoning Capabilities of LLM Provers via the Obfuscated Natural Number Game. Proceedings of the AAAI Symposium Series, 8(1), 588–591. https://doi.org/10.1609/aaaiss.v8i1.42591