[1] Q. Chen, S. Di, and W. Xie, “Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos”, AAAI, vol. 39, no. 2, pp. 2159–2167, Apr. 2025.