Wang Z, You H, Li LH, Zareian A, Park S, Liang Y, et al. SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning. AAAI [Internet]. 2022 Jun 28 [cited 2026 May 12];36(5):5914-22. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/20536