Recurrent Nested Model for Sequence Generation

Authors

  • Wenhao Jiang, Tencent AI Lab
  • Lin Ma, Tencent AI Lab
  • Wei Lu, University of Electronic Science and Technology of China

DOI

https://doi.org/10.1609/aaai.v34i07.6768

Abstract

Depth has been shown to be beneficial to neural network models. In this paper, we attempt to make the encoder-decoder model for sequence generation deeper. We propose a module that can be plugged in between the encoder and decoder to increase the depth of the whole model. The proposed module follows a nested structure: it is divided into blocks, with each block containing several recurrent transition steps. To reduce the training difficulty and preserve the information the decoder needs during the transitions, inter-block and intra-block connections are constructed in our model. The inter-block connections provide the thought vector produced by the current block to all subsequent blocks. The intra-block connections connect all the hidden states entering the current block to the current transition step. The advantages of our model are demonstrated on image captioning and code captioning tasks.
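The nested structure described in the abstract can be sketched in code. The following is a minimal, hypothetical PyTorch sketch, not the authors' implementation: it assumes GRU-style transition cells and a simple summed-projection scheme for merging the connected states, and all names (RecurrentNestedModule, num_blocks, steps_per_block) are illustrative.

```python
# Hypothetical sketch of the recurrent nested module from the abstract:
# blocks of recurrent transition steps placed between the encoder and
# decoder, with inter-block connections (each block's thought vector is
# fed to all subsequent blocks) and intra-block connections (all hidden
# states entering the current block condition every transition step
# inside it). The GRUCell transitions and aggregation are assumptions.
import torch
import torch.nn as nn


class RecurrentNestedModule(nn.Module):
    def __init__(self, dim: int, num_blocks: int = 3, steps_per_block: int = 2):
        super().__init__()
        self.num_blocks = num_blocks
        self.steps_per_block = steps_per_block
        # One transition cell per block.
        self.cells = nn.ModuleList(
            [nn.GRUCell(dim, dim) for _ in range(num_blocks)]
        )
        # Projections that merge a variable number of connected states
        # into a fixed-size input for the transition cell.
        self.inter_proj = nn.Linear(dim, dim)
        self.intra_proj = nn.Linear(dim, dim)

    def forward(self, thought: torch.Tensor) -> torch.Tensor:
        # thought: (batch, dim) vector produced by the encoder.
        block_outputs = [thought]  # inter-block: outputs of earlier blocks
        h = thought
        for b in range(self.num_blocks):
            # Intra-block: snapshot of all hidden states entering this
            # block; each transition step inside the block sees them.
            entering = list(block_outputs)
            for _ in range(self.steps_per_block):
                # Aggregate the connected states by summed projections
                # (one plausible choice, not specified by the paper).
                inter = self.inter_proj(
                    torch.stack(block_outputs, dim=0).sum(dim=0))
                intra = self.intra_proj(
                    torch.stack(entering, dim=0).sum(dim=0))
                h = self.cells[b](inter + intra, h)
            block_outputs.append(h)
        # The final state would be handed to the decoder.
        return h
```

For instance, with `dim=512` the module maps an encoder thought vector of shape `(batch, 512)` to an equally sized vector for the decoder, inserting `num_blocks * steps_per_block` recurrent transitions of extra depth in between.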

Published

2020-04-03

How to Cite

Jiang, W., Ma, L., & Lu, W. (2020). Recurrent Nested Model for Sequence Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 11117-11124. https://doi.org/10.1609/aaai.v34i07.6768

Section

AAAI Technical Track: Vision