RECCHIA, Gabriel; MANGAT, Chatrik Singh; LI, Issac; KRISHNAKUMAR, Gayatri. FindTheFlaws: Annotated Errors for Detecting Flawed Reasoning and Scalable Oversight Research. Proceedings of the AAAI Conference on Artificial Intelligence, [S. l.], v. 40, n. 44, p. 37867–37876, 2026. DOI: 10.1609/aaai.v40i44.41123. Available at: https://ojs.aaai.org/index.php/AAAI/article/view/41123. Accessed: 25 May 2026.