Wang, N., Deng, J., & Jia, M. (2024). Cycle-Consistency Learning for Captioning and Grounding. Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 5535–5543. https://doi.org/10.1609/aaai.v38i6.28363