Zhang, Hui, et al. “LLaVA-MS-PIT: Multi-Modal Schema-Guided Progressive Instruction Tuning for Multi-Modal Event Extraction.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 41, Mar. 2026, pp. 34692-700, doi:10.1609/aaai.v40i41.40770.