Enhancing Multi-Robot Semantic Navigation Through Multimodal Chain-of-Thought Score Collaboration

Zhixuan Shen; Haonan Luo; Kexun Chen; Fengmao Lv; Tianrui Li

doi:10.1609/aaai.v39i14.33607

Authors

Zhixuan Shen, School of Computing and Artificial Intelligence, Southwest Jiaotong University, China
Haonan Luo, School of Computing and Artificial Intelligence, Southwest Jiaotong University, China
Kexun Chen, School of Computing and Artificial Intelligence, Southwest Jiaotong University, China
Fengmao Lv, School of Computing and Artificial Intelligence, Southwest Jiaotong University, China
Tianrui Li, School of Computing and Artificial Intelligence, Southwest Jiaotong University, China

DOI: https://doi.org/10.1609/aaai.v39i14.33607

Abstract

Understanding how humans cooperatively utilize semantic knowledge to explore unfamiliar environments and decide on navigation directions is critical for household service multi-robot systems. Previous methods primarily focused on single-robot centralized planning strategies, which severely limited exploration efficiency. Recent research has considered decentralized planning strategies for multiple robots, assigning separate planning models to each robot, but these approaches often overlook communication costs. In this work, we propose Multimodal Chain-of-Thought Co-Navigation (MCoCoNav), a modular approach that utilizes multimodal Chain-of-Thought to plan collaborative semantic navigation for multiple robots. MCoCoNav combines visual perception with Vision Language Models (VLMs) to evaluate exploration value through probabilistic scoring, thus reducing time costs and achieving stable outputs. Additionally, a global semantic map is used as a communication bridge, minimizing communication overhead while integrating observational results. Guided by scores that reflect exploration trends, robots utilize this map to assess whether to explore new frontier points or revisit history nodes. Experiments on HM3D_v0.2 and MP3D demonstrate the effectiveness of our approach.
