[1] S. Zhao, “An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos”, AAAI, vol. 34, no. 01, pp. 303-311, Apr. 2020.