[1] Sun, Z., Sarma, P., Sethares, W. and Liang, Y. 2020. Learning Relationships between Text, Audio, and Video via Deep Canonical Correlation for Multimodal Language Analysis. Proceedings of the AAAI Conference on Artificial Intelligence. 34, 05 (Apr. 2020), 8992–8999. DOI:https://doi.org/10.1609/aaai.v34i05.6431.