Transfer Reinforcement Learning Using Output-Gated Working Memory
Transfer learning allows for knowledge to generalize across tasks, resulting in increased learning speed and/or performance. These tasks must have commonalities that allow for knowledge to be transferred. The main goal of transfer learning in the reinforcement learning domain is to train and learn on one or more source tasks in order to learn a target task that exhibits better performance than if transfer was not used (Taylor and Stone 2009). Furthermore, the use of output-gated neural network models of working memory has been shown to increase generalization for supervised learning tasks (Kriete and Noelle 2011; Kriete et al. 2013). We propose that working memory-based generalization plays a significant role in a model's ability to transfer knowledge successfully across tasks. Thus, we extended the Holographic Working Memory Toolkit (HWMtk) (Dubois and Phillips 2017; Phillips and Noelle 2005) to utilize the generalization benefits of output gating within a working memory system. Finally, the model's utility was tested on a temporally extended, partially observable 5x5 2D grid-world maze task that required the agent to learn 3 tasks over the duration of the training period. The results indicate that the addition of output gating increases the initial learning performance of an agent in target tasks and decreases the learning time required to reach a fixed performance threshold.