SPARD: Single-step Inference with Adaptive Sampling in Residual Diffusion for Human Motion Prediction

Yiming Zhang; Baojia Han; Ximing Li; Wei Pang; Fausto Giunchiglia; Xiaoyue Feng; Renchu Guan

doi:10.1609/aaai.v40i21.38865

Authors

Yiming Zhang, Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University
Baojia Han, Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University
Ximing Li, Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University; RIKEN Center for Advanced Intelligence Project, Japan
Wei Pang, School of Mathematical and Computer Sciences, Heriot-Watt University
Fausto Giunchiglia, Department of Information Engineering and Computer Science, University of Trento
Xiaoyue Feng, Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University
Renchu Guan, Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University

DOI:

https://doi.org/10.1609/aaai.v40i21.38865

Abstract

The task of stochastic human motion prediction has attracted significant attention in recent years due to its wide-ranging applications in robotics, animation, and human-computer interaction. While diffusion models have demonstrated promising progress in this domain, they remain hindered by two critical limitations: (1) slow inference speeds due to their reliance on iterative sampling, and (2) performance degradation resulting from suboptimal sample allocation during generation. To overcome these challenges, we propose SPARD (Single-step Inference with Adaptive Sampling in Residual Diffusion for Human Motion Prediction), a novel framework that achieves efficient single-step inference while maintaining high predictive accuracy. Furthermore, we introduce a novel adaptive noise predictor module that dynamically samples latent representations based on observed motion sequences, ensuring both accuracy and plausibility in generated motions. Extensive experiments on benchmark datasets demonstrate that SPARD significantly outperforms state-of-the-art methods in both inference efficiency and motion quality, achieving a 15× to 18× speedup in sampling time compared to conventional diffusion-based baselines while preserving generation quality.
