Dong, Qianqian, et al. “Listen, Understand and Translate: Triple Supervision Decouples End-to-End Speech-to-Text Translation.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 14, May 2021, pp. 12749-5, doi:10.1609/aaai.v35i14.17509.