[1] N. Wang, J. Deng, and M. Jia, “Cycle-Consistency Learning for Captioning and Grounding”, AAAI, vol. 38, no. 6, pp. 5535-5543, Mar. 2024.