Hanks, S., Pollack, M. E., & Cohen, P. R. (1993). Benchmarks, Test Beds, Controlled Experimentation, and the Design of Agent Architectures. AI Magazine, 14(4), 17. https://doi.org/10.1609/aimag.v14i4.1066