Xuan, Hanyu, Zhenyu Zhang, Shuo Chen, Jian Yang, and Yan Yan. 2020. “Cross-Modal Attention Network for Temporal Inconsistent Audio-Visual Event Localization.” Proceedings of the AAAI Conference on Artificial Intelligence 34 (01): 279–86. https://doi.org/10.1609/aaai.v34i01.5361.