Cao, J., Hu, Y., Tan, Z., & Zhao, X. (2025). Cross-modal Multi-task Learning for Multimedia Event Extraction. Proceedings of the AAAI Conference on Artificial Intelligence, 39(11), 11454–11462. https://doi.org/10.1609/aaai.v39i11.33246