[1]
Hu, X., Yin, X., Lin, K., Zhang, L., Gao, J., Wang, L. and Liu, Z. 2021. VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning. Proceedings of the AAAI Conference on Artificial Intelligence. 35, 2 (May 2021), 1575–1583.