[1] Dong, Q. et al. 2021. Listen, Understand and Translate: Triple Supervision Decouples End-to-end Speech-to-text Translation. Proceedings of the AAAI Conference on Artificial Intelligence. 35, 14 (May 2021), 12749–12759. DOI:https://doi.org/10.1609/aaai.v35i14.17509.