Dong, Q. (2021) “Listen, Understand and Translate: Triple Supervision Decouples End-to-end Speech-to-text Translation”, Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), pp. 12749–12759. doi: 10.1609/aaai.v35i14.17509.