Recurrent Nested Model for Sequence Generation

Wenhao Jiang; Lin Ma; Wei Lu

doi:10.1609/aaai.v34i07.6768

Authors

Wenhao Jiang Tencent AI Lab
Lin Ma Tencent AI Lab
Wei Lu University of Electronic Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v34i07.6768

Abstract

Depth has been shown to benefit neural network models. In this paper, we attempt to make the encoder-decoder model deeper for sequence generation. We propose a module that can be plugged in between the encoder and the decoder to increase the depth of the whole model. The proposed module follows a nested structure: it is divided into blocks, each containing several recurrent transition steps. To reduce the training difficulty and preserve the information the decoder needs during transitions, we construct inter-block and intra-block connections. The inter-block connections provide the thought vectors from the current block to all subsequent blocks. The intra-block connections connect all the hidden states entering the current block to the current transition step. We illustrate the advantages of our model on the image captioning and code captioning tasks.
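The connectivity described in the abstract can be sketched in a few lines of plain Python. This is only an illustration of the wiring, not the authors' implementation: the abstract gives no equations, so the tanh transition, the averaging of the entering states, the random weights, and every name below (`nested_module`, `steps_per_block`, `trace`, and so on) are assumptions made for the example.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix as a list of rows (hypothetical initialisation)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mean_vectors(vectors):
    """Element-wise mean of equally sized vectors (assumed way to merge states)."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def nested_module(encoder_state, num_blocks=3, steps_per_block=2, seed=0):
    """Run a toy nested module placed between an encoder and a decoder.

    Returns the final thought vector and a trace of
    (block index, step index, number of states entering the block).
    """
    rng = random.Random(seed)
    dim = len(encoder_state)
    thought_vectors = [encoder_state]  # the encoder output counts as the first one
    trace = []

    for block in range(num_blocks):
        # Inter-block connections: this block receives the thought vectors
        # of every earlier block, not only the most recent one.
        entering = list(thought_vectors)
        context = mean_vectors(entering)
        state = entering[-1]

        for step in range(steps_per_block):
            # Intra-block connections: every transition step sees all the
            # states that entered the block (summarised here by `context`).
            w_state = make_matrix(dim, dim, rng)
            w_context = make_matrix(dim, dim, rng)
            pre_activation = [
                a + b
                for a, b in zip(matvec(w_state, state), matvec(w_context, context))
            ]
            state = [math.tanh(p) for p in pre_activation]
            trace.append((block, step, len(entering)))

        # The block's last hidden state becomes its thought vector.
        thought_vectors.append(state)

    return thought_vectors[-1], trace


if __name__ == "__main__":
    final_state, trace = nested_module([0.5, -0.2, 0.1, 0.9])
    print(len(final_state), trace)
```

The trace makes the nesting visible: the number of states entering a block grows by one per block, while every step inside a block sees the same set of entering states. A real model would learn the weights and would likely merge the entering states with something richer than a mean, such as attention or gating.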
