[1] Y. Zhan, Y. Yuan, and Z. Xiong, “Mono3DVG: 3D Visual Grounding in Monocular Images”, AAAI, vol. 38, no. 7, pp. 6988-6996, Mar. 2024.