[1] T. Pham, “Truth Behind the Scene: Designing Evaluations Benchmarks to Assess LLMs’ Task-Specific Understanding over Test-Taking Strategies”, AAAI, vol. 39, no. 28, pp. 29596-29598, Apr. 2025.