Shamsolmoali, P. (2024) “SeTformer Is What You Need for Vision and Language”, Proceedings of the AAAI Conference on Artificial Intelligence, 38(5), pp. 4713–4721. doi: 10.1609/aaai.v38i5.28272.