SPARD: Single-step Inference with Adaptive Sampling in Residual Diffusion for Human Motion Prediction

Authors

  • Yiming Zhang, Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University
  • Baojia Han, Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University
  • Ximing Li, Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University; RIKEN Center for Advanced Intelligence Project, Japan
  • Wei Pang, School of Mathematical and Computer Sciences, Heriot-Watt University
  • Fausto Giunchiglia, Department of Information Engineering and Computer Science, University of Trento
  • Xiaoyue Feng, Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University
  • Renchu Guan, Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University

DOI:

https://doi.org/10.1609/aaai.v40i21.38865

Abstract

The task of stochastic human motion prediction has attracted significant attention in recent years due to its wide-ranging applications in robotics, animation, and human-computer interaction. While diffusion models have demonstrated promising progress in this domain, they remain hindered by two critical limitations: (1) slow inference speeds due to their reliance on iterative sampling, and (2) performance degradation resulting from suboptimal sample allocation during generation. To overcome these challenges, we propose SPARD (Single-step Inference with Adaptive Sampling in Residual Diffusion for Human Motion Prediction), a novel framework that achieves efficient single-step inference while maintaining high predictive accuracy. Furthermore, we introduce a novel adaptive noise predictor module that dynamically samples latent representations based on observed motion sequences, ensuring both accuracy and plausibility in generated motions. Extensive experiments on benchmark datasets demonstrate that SPARD significantly outperforms state-of-the-art methods in both inference efficiency and motion quality, achieving a 15× to 18× speedup in sampling time compared to conventional diffusion-based baselines while preserving generation quality.

Published

2026-03-14

How to Cite

Zhang, Y., Han, B., Li, X., Pang, W., Giunchiglia, F., Feng, X., & Guan, R. (2026). SPARD: Single-step Inference with Adaptive Sampling in Residual Diffusion for Human Motion Prediction. Proceedings of the AAAI Conference on Artificial Intelligence, 40(21), 18046–18054. https://doi.org/10.1609/aaai.v40i21.38865

Section

AAAI Technical Track on Humans and AI