Huang, W., Wu, A., Yang, Y., Luo, X., Yang, Y., Naseem, U., … Hu, L. (2026). LLM2CLIP: Powerful Language Model Unlocks Richer Cross-Modality Representation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(7), 5131–5139. https://doi.org/10.1609/aaai.v40i7.37427