Liu, Yongfei, Bo Wan, Xiaodan Zhu, and Xuming He. “Learning Cross-Modal Context Graph for Visual Grounding.” Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11645-11652. Accessed May 22, 2024. https://ojs.aaai.org/index.php/AAAI/article/view/6833.