(1) Xu, Z.; Ding, J.; Lou, Y.; Zhang, K.; Gong, D.; Li, Y. Socrates or Smartypants: Testing Logic Reasoning Capabilities of Large Language Models With Logic Programming-Based Test Oracles. AAAI 2026, 40, 19433-19440.