Schumann, Raphael, Wanrong Zhu, Weixi Feng, Tsu-Jui Fu, Stefan Riezler, and William Yang Wang. 2024. “VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View.” Proceedings of the AAAI Conference on Artificial Intelligence 38 (17): 18924–33. https://doi.org/10.1609/aaai.v38i17.29858.