(1) Liu, Z.; Liu, J.; Ma, F. Improving Cross-Modal Alignment With Synthetic Pairs for Text-Only Image Captioning. AAAI 2024, 38, 3864-3872.