Liu, Zhiyue, Jinyuan Liu, and Fanrong Ma. 2024. “Improving Cross-Modal Alignment with Synthetic Pairs for Text-Only Image Captioning.” Proceedings of the AAAI Conference on Artificial Intelligence 38 (4): 3864-72. https://doi.org/10.1609/aaai.v38i4.28178.