[1]
Xuan, H., Zhang, Z., Chen, S., Yang, J. and Yan, Y. 2020. Cross-Modal Attention Network for Temporal Inconsistent Audio-Visual Event Localization. Proceedings of the AAAI Conference on Artificial Intelligence. 34, 01 (Apr. 2020), 279-286. DOI:https://doi.org/10.1609/aaai.v34i01.5361.