(1) Shamsolmoali, P.; Zareapoor, M.; Granger, E.; Felsberg, M. SeTformer Is What You Need for Vision and Language. AAAI 2024, 38, 4713-4721.