[1] Z. Wang, “SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning,” AAAI, vol. 36, no. 5, pp. 5914-5922, Jun. 2022.