Comparing Reward Shaping, Visual Hints, and Curriculum Learning

Authors

  • Rey Pocius, Oregon State University
  • David Isele, University of Pennsylvania
  • Mark Roberts, United States Naval Research Laboratory
  • David Aha, United States Naval Research Laboratory

DOI:

https://doi.org/10.1609/aaai.v32i1.12160

Keywords:

Curriculum Learning, Reward Shaping, Task Transfer

Abstract

Common approaches to learning complex tasks in reinforcement learning include reward shaping, environmental hints, and curricula. Yet few studies examine how these approaches compare to each other, when one might be preferred over another, or how they may complement each other. As a first step in this direction, we compare reward shaping, hints, and curricula for a Deep RL agent in the game of Minecraft. We seek to answer whether reward shaping, visual hints, or curricula have the most impact on performance, which we measure as the time to reach the target, the distance from the target, the cumulative reward, or the number of actions taken. Our analyses show that performance is most impacted by the curriculum used and by visual hints; shaping had less impact. For similar navigation tasks, the results suggest that designing an effective curriculum and providing appropriate hints most improves performance.

Published

2018-04-29

How to Cite

Pocius, R., Isele, D., Roberts, M., & Aha, D. (2018). Comparing Reward Shaping, Visual Hints, and Curriculum Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.12160