[1] Wang, Z. et al. 2021. Confidence-aware Non-repetitive Multimodal Transformers for TextCaps. Proceedings of the AAAI Conference on Artificial Intelligence. 35, 4 (May 2021), 2835–2843. DOI:https://doi.org/10.1609/aaai.v35i4.16389.