Recchia, Gabriel, Chatrik Singh Mangat, Issac Li, and Gayatri Krishnakumar. 2026. “FindTheFlaws: Annotated Errors for Detecting Flawed Reasoning and Scalable Oversight Research.” Proceedings of the AAAI Conference on Artificial Intelligence 40 (44): 37867–76. https://doi.org/10.1609/aaai.v40i44.41123.