DGV: Fusing Dynamic Graphs and Vision-Language Models for Collaborative Dual-Arm Task Planning

Yapeng Pang; Junjie Xu; Zhidong Qiao; Peng Du; Xinyu Zhang

doi:10.1609/icaps.v36i1.42895

Authors

Yapeng Pang, East China Normal University
Junjie Xu, East China Normal University
Zhidong Qiao, Harbin Institute of Technology
Peng Du, Zhejiang University
Xinyu Zhang, East China Normal University

Abstract

Dual-arm collaborative manipulation in dynamic, unstructured environments is challenging: it requires real-time handling of high-dimensional physical constraints alongside dynamic scene understanding and adaptation to high-level natural language instructions. To address these challenges, we propose the Dynamic Graph Vision-Language Model (DGV), a dynamic task planning framework that integrates graph neural networks (GNNs) and vision-language models (VLMs). DGV first uses a pre-trained VLM to combine perceptual and semantic processing, extracting object states and complex manipulation intents from the environment. This information is then encoded into a dynamic spatio-temporal graph that models the robot's kinematic structure, environmental object relations, and temporal dependencies within a single, unified representation. To cope with rapid environmental changes, we propose a real-time local subgraph update mechanism that enables immediate action adjustments and efficient replanning based on fresh visual feedback, improving dynamic adaptability. Using the updated graph structure, DGV performs efficient reasoning to generate continuous, stable, and robust dual-arm collaborative motion sequences. Experimental results on both simulation and real-world robot platforms demonstrate that DGV achieves a task success rate nearly 20% higher than current state-of-the-art methods, while exhibiting superior dynamic adaptability and robustness.
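
The abstract does not give implementation details, so the following is only an illustrative sketch of what a local subgraph update could look like. It is not the authors' method. The class `SceneGraph`, the method `update_local`, the distance-based edge rule, and the `radius` value are all hypothetical, and the sketch leaves out the VLM, the GNN, and the temporal dimension of the graph. The idea shown is that when one node (for example, an object reported at a new position by visual feedback) changes, only the edges incident to that node are recomputed, and the rest of the graph is left untouched.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class SceneGraph:
    """Toy scene graph: nodes are named 3D points, edges link nearby nodes.

    Hypothetical illustration only; the distance rule and radius are assumptions.
    """
    radius: float = 0.5                       # assumed proximity threshold
    pos: dict = field(default_factory=dict)   # node name -> (x, y, z)
    adj: dict = field(default_factory=dict)   # node name -> set of neighbour names

    def update_local(self, name, xyz):
        """Insert or move one node and rebuild only the edges that touch it."""
        self.pos[name] = xyz
        # Detach the node from its old neighbours; other edges are not visited.
        for other in self.adj.pop(name, set()):
            self.adj[other].discard(name)
        self.adj[name] = set()
        # Reconnect the node to whatever is now within the radius.
        for other, other_xyz in self.pos.items():
            if other != name and dist(xyz, other_xyz) <= self.radius:
                self.adj[name].add(other)
                self.adj[other].add(name)
        # The new neighbourhood is the part of the graph a planner would revisit.
        return set(self.adj[name])


graph = SceneGraph(radius=0.5)
graph.update_local("left_gripper", (0.0, 0.0, 0.0))
graph.update_local("right_gripper", (1.0, 0.0, 0.0))
graph.update_local("cup", (0.2, 0.0, 0.0))      # cup starts near the left gripper

# Visual feedback reports that the cup has moved toward the right gripper.
affected = graph.update_local("cup", (0.8, 0.0, 0.0))
print(sorted(affected))
```

In this toy setting the update touches only the moved node and its old and new neighbours, which is the property that would make replanning cheap when a single object changes.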
