[1] G. Meng, “EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models”, AAAI, vol. 39, no. 6, pp. 6126–6134, Apr. 2025.