Guo, Zixin, Chen Liang, Ziyu Wan, and Yang Bai. “Global Fusion Attention for Vision and Language Understanding (Student Abstract).” Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 18 (May 18, 2021): 15789–15790. Accessed May 26, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/17891.