Guo, Z., Xu, B., Zhu, C., Hong, W., Wang, X., & Mao, Z. (2026). MCP-AgentBench: Evaluating Real-World Language Agent Performance with MCP-Mediated Tools. Proceedings of the AAAI Conference on Artificial Intelligence, 40(37), 30888–30896. https://doi.org/10.1609/aaai.v40i37.40347