[1] C. Jiang, “TiMix: Text-Aware Image Mixing for Effective Vision-Language Pre-training”, AAAI, vol. 38, no. 3, pp. 2489–2497, Mar. 2024.