Li, Xuzhao, Xuchen Li, Shiyu Hu, Yongzhen Guo, and Wentao Zhang. 2026. “VerifyBench: A Systematic Benchmark for Evaluating Reasoning Verifiers Across Domains.” Proceedings of the AAAI Conference on Artificial Intelligence 40 (38): 31796-804. https://doi.org/10.1609/aaai.v40i38.40448.