Labeled Memory Networks for Online Model Adaptation

Authors

  • Shiv Shankar, IIT Bombay
  • Sunita Sarawagi, IIT Bombay

DOI:

https://doi.org/10.1609/aaai.v32i1.11781

Keywords:

model adaptation, plug and play models, kernel, memory networks, online adaptation

Abstract

Augmenting a neural network with memory that can grow without growing the number of trained parameters is a recent, powerful concept with many exciting applications. In this paper, we establish its potential for adapting a batch-trained neural network online to domain-relevant labeled data at deployment time. We present the design of the Labeled Memory Network (LMN), a new memory-augmented neural network (MANN) for fast online model adaptation, and highlight three key features. First, LMNs treat memory as a second boosted stage following the trained network, allowing the memory and the network to play complementary roles; unlike existing MANNs that write to memory at every cycle, LMNs use memory more economically by writing only labeled data with non-zero loss. Second, LMNs organize the memory with the discrete class label as the primary key, unlike existing MANNs where the key is a real vector derived from the input. This simple yet surprisingly unexplored organization safeguards against the catastrophic forgetting of rare labels to which current LRU-based MANNs are subject. Finally, LMNs model the evolving expertise of the memory and the network with an RNN in order to determine their respective weights online. We evaluate online model adaptation strategies on five sequence prediction tasks, an image classification task, and two language modeling tasks. We show that LMNs outperform other MANNs designed for meta-learning, and that they are more accurate and faster than state-of-the-art methods that retune model parameters to adapt to domain-specific labeled data.
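
To make the abstract's three design choices concrete, the following is a minimal Python sketch of a label-keyed, loss-gated memory combined with a base network. It is an illustration inferred from the abstract alone, not the authors' implementation: the names (LabeledMemory, write, read, combine), the per-label slot budget, the cosine-similarity kernel, and the scalar gate alpha standing in for the RNN-predicted weights are all assumptions.

    import numpy as np

    class LabeledMemory:
        def __init__(self, num_classes, slots_per_class=16):
            # One slot list per discrete class label: the label itself is
            # the primary key, so eviction in one class never displaces
            # stored examples of another (possibly rare) class.
            self.slots = {y: [] for y in range(num_classes)}
            self.slots_per_class = slots_per_class

        def write(self, h, y, loss):
            # Unlike MANNs that write on every cycle, store an example
            # only when the base network's loss on it is non-zero.
            if loss <= 0.0:
                return
            bucket = self.slots[y]
            if len(bucket) >= self.slots_per_class:
                bucket.pop(0)  # evict the oldest entry of this label only
            bucket.append(h / (np.linalg.norm(h) + 1e-8))

        def read(self, h):
            # Per-label score: maximum cosine similarity between the query
            # and the label's stored vectors (the kernel is an assumption).
            q = h / (np.linalg.norm(h) + 1e-8)
            scores = np.full(len(self.slots), -1.0)
            for y, bucket in self.slots.items():
                if bucket:
                    scores[y] = max(float(q @ m) for m in bucket)
            return scores

    def combine(net_logits, mem_scores, alpha):
        # Mix the base network and the memory; alpha is a scalar stand-in
        # for the weight the abstract says an RNN determines online.
        return (1.0 - alpha) * net_logits + alpha * mem_scores

    # Example: a 3-class problem with 8-dimensional hidden states.
    mem = LabeledMemory(num_classes=3, slots_per_class=4)
    h = np.random.randn(8)
    mem.write(h, y=1, loss=0.7)  # stored, because the network got it wrong
    scores = combine(np.zeros(3), mem.read(h), alpha=0.5)

The point of the per-label buckets is that eviction pressure from frequent classes never touches the slots of rare classes, which is exactly the forgetting failure mode of LRU-managed memories that the abstract highlights.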

Published

2018-04-29

How to Cite

Shankar, S., & Sarawagi, S. (2018). Labeled Memory Networks for Online Model Adaptation. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.11781