[1]
L. Huang, S.-H. Zhong, Z. Zhang, and Y. Liu, “From Pixels to Logic: A Perception-Reasoning Decomposition Framework for Open-World Referring Expression Comprehension”, AAAI, vol. 40, no. 7, pp. 5058–5066, Mar. 2026.