Dong, Qianqian, Rong Ye, Mingxuan Wang, Hao Zhou, Shuang Xu, Bo Xu, and Lei Li. 2021. “Listen, Understand and Translate: Triple Supervision Decouples End-to-End Speech-to-Text Translation.” Proceedings of the AAAI Conference on Artificial Intelligence 35 (14):12749-59. https://doi.org/10.1609/aaai.v35i14.17509.