ManiLong-Shot: Interaction-Aware One-Shot Imitation Learning for Long-Horizon Manipulation

Zixuan Chen; Chongkai Gao; Lin Shao; Jieqi Shi; Jing Huo; Yang Gao

doi:10.1609/aaai.v40i22.38881

Authors

Zixuan Chen, Nanjing University, Nanjing, China
Chongkai Gao, National University of Singapore, Singapore
Lin Shao, National University of Singapore, Singapore
Jieqi Shi, Nanjing University, Suzhou, China
Jing Huo, Nanjing University, Nanjing, China
Yang Gao, YiLi Normal University, Xinjiang, China; Nanjing University, Nanjing, China; Nanjing University, Suzhou, China

Abstract

One-shot imitation learning (OSIL) offers a promising way to teach robots new skills without large-scale data collection. However, current OSIL methods are largely confined to short-horizon tasks, which restricts their applicability to complex, long-horizon manipulation. To address this limitation, we propose ManiLong-Shot, a novel framework that enables effective OSIL for long-horizon prehensile manipulation tasks. ManiLong-Shot structures long-horizon tasks around physical interaction events, reframing the problem as sequencing interaction-aware primitives rather than directly imitating continuous trajectories. This primitive decomposition can be driven either by high-level reasoning from a vision-language model (VLM) or by rule-based heuristics derived from robot state changes. For each primitive, ManiLong-Shot predicts the invariant regions critical to the interaction, establishes correspondences between the demonstration and the current observation, and computes the target end-effector pose for execution. Extensive simulation experiments show that ManiLong-Shot, trained on only 10 short-horizon tasks, generalizes to 20 unseen long-horizon tasks across three difficulty levels via one-shot imitation, achieving a 22.8% relative improvement over the state of the art (SOTA). Real-robot experiments further show that ManiLong-Shot robustly executes three long-horizon manipulation tasks via OSIL, confirming its practical applicability.
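
The abstract says that primitive decomposition can follow rule-based heuristics derived from robot state changes, but it does not spell out the rules. As a rough illustration only, the sketch below assumes one plausible heuristic: treating every flip of the gripper between open and closed as an interaction event that ends a segment. The function name `segment_by_gripper_events` and the boolean-per-timestep input format are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def segment_by_gripper_events(gripper_open: List[bool]) -> List[Tuple[int, int]]:
    """Split a demonstration into segments that each end at a gripper state change.

    gripper_open[t] is True when the gripper is open at timestep t
    (hypothetical input format). Returns inclusive (start, end) index pairs.
    """
    if not gripper_open:
        return []
    segments = []
    start = 0
    for t in range(1, len(gripper_open)):
        # Assumption: a flip between open and closed marks an interaction
        # event (grasp or release) and closes the current segment at t.
        if gripper_open[t] != gripper_open[t - 1]:
            segments.append((start, t))
            start = t + 1
    # Any motion after the last event forms a final segment.
    if start < len(gripper_open):
        segments.append((start, len(gripper_open) - 1))
    return segments


# Toy demonstration: approach with open gripper, grasp, carry, release.
demo = [True, True, True, False, False, False, True, True]
print(segment_by_gripper_events(demo))
```

Each returned index pair would then correspond to one candidate primitive. A real system would likely combine several state signals (for example contact or velocity) instead of relying on the gripper state alone.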
