Tian, C. (2026) “Rectify Evaluation Preference: Improving LLMs’ Critique on Math Reasoning via Perplexity-aware Reinforcement Learning”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(39), pp. 33241–33249. doi: 10.1609/aaai.v40i39.40609.