Attention-via-Attention Neural Machine Translation


  • Shenjian Zhao Shanghai Jiao Tong University
  • Zhihua Zhang Peking University



translation, lexical similarity, character level


Many languages descend from a common ancestral language and continue to influence one another, so they inevitably share similarities such as lexical similarity and named-entity similarity. In this paper, we leverage these similarities to improve translation performance in neural machine translation. Specifically, we introduce an attention-via-attention mechanism that allows information from source-side characters to flow directly to the target side. With this mechanism, target-side characters are generated from the representations of source-side characters when the source and target words are similar. For instance, our proposed neural machine translation system learns to transfer the character-level information of the English word "system" through the attention-via-attention mechanism to generate the Czech word "systém". Consequently, our approach not only achieves competitive translation performance but also reduces the model size significantly.
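
To make the notion of lexical similarity concrete, the following is a minimal sketch that scores the "system"/"systém" pair from the abstract with a normalized Levenshtein (edit-distance) similarity. This measure is an illustrative assumption, not the method of the paper, which learns such correspondences through attention. The function names `edit_distance` and `lexical_similarity` are hypothetical.

```python
import unicodedata


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def lexical_similarity(a: str, b: str) -> float:
    """Illustrative score in [0, 1]; 1.0 means the words are identical."""
    # Assumption: compare NFC-normalized text so that an accented letter
    # such as "é" counts as a single character.
    a = unicodedata.normalize("NFC", a)
    b = unicodedata.normalize("NFC", b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


print(round(lexical_similarity("system", "systém"), 3))  # prints 0.833
```

The two words differ by a single substitution out of six characters, which is the kind of near-identical pair where copying character-level information from the source side is most useful.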




How to Cite

Zhao, S., & Zhang, Z. (2018). Attention-via-Attention Neural Machine Translation. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1).