Wu, R., Chen, P., Shen, F., Zhao, S., Hui, Q., Gao, H., … Lian, S. (2026). HiMo-CLIP: Modeling Semantic Hierarchy and Monotonicity in Vision-Language Alignment. Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), 26974–26982. https://doi.org/10.1609/aaai.v40i32.39910