[1] Wang, Z., You, H., Li, L.H., Zareian, A., Park, S., Liang, Y., Chang, K.-W. and Chang, S.-F. 2022. SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning. Proceedings of the AAAI Conference on Artificial Intelligence. 36, 5 (Jun. 2022), 5914-5922. DOI:https://doi.org/10.1609/aaai.v36i5.20536.