The Logic of Benchmarking: A Case Against State-of-the-Art Performance

Authors

  • Wheeler Ruml, University of New Hampshire

DOI:

https://doi.org/10.1609/socs.v1i1.18179

Keywords:

heuristic search, empirical methods, benchmarking, methodology

Abstract

This note marshals arguments for three points. First, it is better to test on small benchmark instances than to solve the largest possible ones. This eases replication and allows a more diverse set of instances to be tested. There are few conclusions that one can draw from running on large benchmarks that cannot also be drawn from running on small ones. Second, experimental evaluation should focus on understanding algorithm behavior and forming predictive models, rather than on achieving state-of-the-art performance on toy problems. Third, it is more important to develop search techniques that are robust across multiple domains than ones that give state-of-the-art performance in only a single domain. Robust techniques are more likely to be useful to others.
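
As a concrete illustration of the second point, a predictive model can be as simple as a fitted scaling curve: measure search effort on a range of small instances, fit a functional form, and check how well it extrapolates to larger ones. The sketch below is not from the paper; the measurements are synthetic and the power-law relationship between instance size and nodes expanded is an illustrative assumption.

```python
# Minimal sketch (not from the paper): forming a predictive model of
# search effort from small benchmark instances, then extrapolating.
# The data and the power-law form are assumptions for illustration only.
import numpy as np

# Hypothetical measurements: instance size vs. mean nodes expanded.
sizes = np.array([10, 20, 40, 80, 160])
nodes = np.array([1.2e3, 9.8e3, 7.5e4, 6.1e5, 4.9e6])

# Assume nodes ~ a * size^b and fit a line in log-log space;
# np.polyfit returns [slope, intercept] for degree 1.
b, log_a = np.polyfit(np.log(sizes), np.log(nodes), 1)

def predict(size):
    """Predicted nodes expanded for a given instance size."""
    return np.exp(log_a) * size ** b

print(f"fitted exponent b = {b:.2f}")
print(f"predicted effort at size 320: {predict(320):.3g} nodes")
```

A model like this, validated on held-out small instances, supports the abstract's claim: extrapolation from small benchmarks can answer most questions that running the largest instances would.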

Published

2010-08-25