[1] X. Liu, S. Paul, M. Chatterjee, and A. Cherian, “CAVEN: An Embodied Conversational Agent for Efficient Audio-Visual Navigation in Noisy Environments”, AAAI, vol. 38, no. 4, pp. 3765–3773, Mar. 2024.