[1]
C.-C. Yang, W.-C. Fan, C.-F. Yang, and Y.-C. F. Wang, “Cross-Modal Mutual Learning for Audio-Visual Speech Recognition and Manipulation,” AAAI, vol. 36, no. 3, pp. 3036–3044, Jun. 2022.