Zhao, S., Ma, Y., Gu, Y., Yang, J., Xing, T., Xu, P., Hu, R., Chai, H., & Keutzer, K. (2020). An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos. Proceedings of the AAAI Conference on Artificial Intelligence, 34(01), 303-311. https://doi.org/10.1609/aaai.v34i01.5364