Jiang, Dongwei, Jingyu Zhang, Orion Weller, Nathaniel Weir, Benjamin Van Durme, and Daniel Khashabi. “SELF-[IN]CORRECT: LLMs Struggle With Discriminating Self-Generated Responses.” Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 23 (April 11, 2025): 24266–24275. Accessed May 10, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/34603.