Wang, Z., Bao, R., Wu, Q., & Liu, S. (2021). Confidence-aware Non-repetitive Multimodal Transformers for TextCaps. Proceedings of the AAAI Conference on Artificial Intelligence, 35(4), 2835-2843. https://doi.org/10.1609/aaai.v35i4.16389