DONG, Q.; YE, R.; WANG, M.; ZHOU, H.; XU, S.; XU, B.; LI, L. Listen, Understand and Translate: Triple Supervision Decouples End-to-end Speech-to-text Translation. Proceedings of the AAAI Conference on Artificial Intelligence, [S. l.], v. 35, n. 14, p. 12749-12759, 2021. DOI: 10.1609/aaai.v35i14.17509. Available at: https://ojs.aaai.org/index.php/AAAI/article/view/17509. Accessed on: 20 Apr. 2024.