[1] J. Li, “CrossVid: A Comprehensive Benchmark for Evaluating Cross-Video Reasoning in Multimodal Large Language Models”, AAAI, vol. 40, no. 8, pp. 6244–6252, Mar. 2026.