Attention-via-Attention Neural Machine Translation

Shenjian Zhao; Zhihua Zhang

doi:10.1609/aaai.v32i1.11254

Authors

Shenjian Zhao Shanghai Jiao Tong University
Zhihua Zhang Peking University

DOI:

https://doi.org/10.1609/aaai.v32i1.11254

Keywords:

translation, lexical similarity, character level

Abstract

Because many languages descend from a common ancestral language and influence one another, they inevitably share similarities, such as lexical similarity and named-entity similarity. In this paper, we leverage these similarities to improve translation performance in neural machine translation. Specifically, we introduce an attention-via-attention mechanism that allows information from source-side characters to flow directly to the target side. With this mechanism, target-side characters are generated from the representation of source-side characters when the words are similar. For instance, our proposed neural machine translation system learns to transfer the character-level information of the English word "system" through the attention-via-attention mechanism to generate the Czech word "systém". Consequently, our approach not only achieves competitive translation performance but also reduces the model size significantly.
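
To make the notion of lexical similarity concrete, the following minimal sketch scores how much two words overlap at the character level, using the "system" / "systém" pair from the abstract. It is only an illustration built on Python's standard `difflib` module and is not the attention-via-attention mechanism itself; the function name `char_similarity` is a hypothetical helper introduced here.

```python
from difflib import SequenceMatcher


def char_similarity(source_word: str, target_word: str) -> float:
    """Return a character-overlap score in [0, 1] for two words.

    Hypothetical helper for illustration only: it uses difflib's
    matching-block ratio, not anything learned by a translation model.
    """
    return SequenceMatcher(None, source_word, target_word).ratio()


if __name__ == "__main__":
    # Word pair taken from the abstract: the two forms differ in a single character.
    print(char_similarity("system", "systém"))
    # An unrelated pair, chosen here only as a contrast, shares no characters.
    print(char_similarity("system", "dub"))
```

Word pairs that score high under a measure like this are the cases in which, according to the abstract, target-side characters can be generated directly from the source-side character representation.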
