Comparing Human Behavior to an Optimal Policy for Innovation

Bonan Zhao; Natalia Vélez; Thomas L. Griffiths

doi:10.1609/aaaiss.v3i1.31291

Authors

Bonan Zhao, Princeton University
Natalia Vélez, Princeton University
Thomas L. Griffiths, Princeton University

DOI:

https://doi.org/10.1609/aaaiss.v3i1.31291

Keywords:

Innovation, Discovery, Explore-exploit, Decision Making, Optimal Stopping

Abstract

Human learning does not stop at solving a single problem. Instead, we seek new challenges, define new goals, and come up with new ideas. Unlike the classic explore-exploit trade-off between known and unknown options, making new tools or generating new ideas is not about collecting data from existing unknown options, but rather about creating new options out of what is currently available. We introduce a discovery game designed to study how rational agents make decisions about pursuing innovations, where discovering new ideas is a process of combining existing ideas in an open-ended compositional space. We formalize this decision problem as a Markov decision process, derive its optimal policies, and compare people's behavior to the model predictions in an online behavioral experiment. We found evidence that people both innovate rationally, guided by potential returns in this discovery game, and under- and over-explore systematically in different settings.
