Li, Yuzhen, et al. “Mono3DVG-EnSD: Enhanced Spatial-Aware and Dimension-Decoupled Text Encoding for Monocular 3D Visual Grounding.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 8, Mar. 2026, pp. 6726-34, doi:10.1609/aaai.v40i8.37604.