Dou, Z.-Y., Z. Tu, X. Wang, L. Wang, S. Shi, and T. Zhang. “Dynamic Layer Aggregation for Neural Machine Translation With Routing-by-Agreement”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, July 2019, pp. 86-93, doi:10.1609/aaai.v33i01.330186.