Islam, Md Mofijul, Alexi Gladstone, and Tariq Iqbal. “PATRON: Perspective-Aware Multitask Model for Referring Expression Grounding Using Embodied Multimodal Cues.” Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 1 (June 26, 2023): 971–979. Accessed May 26, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/25177.