Wu, R. (2026) “HiMo-CLIP: Modeling Semantic Hierarchy and Monotonicity in Vision-Language Alignment”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), pp. 26974–26982. doi: 10.1609/aaai.v40i32.39910.