[1] Y. Zhang, “LLaVA-UHD v2: Exploiting Hierarchical Vision Granularity in MLLMs via Inverse Semantic Pyramid,” AAAI, vol. 40, no. 15, pp. 12934–12942, Mar. 2026.