Recchia, G. (2026) “FindTheFlaws: Annotated Errors for Detecting Flawed Reasoning and Scalable Oversight Research”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), pp. 37867–37876. doi: 10.1609/aaai.v40i44.41123.