SocialCVAE: Predicting Pedestrian Trajectory via Interaction Conditioned Latents


  • Wei Xiang Zhejiang University
  • Haoteng YIN Purdue University
  • He Wang University College London
  • Xiaogang Jin Zhejiang University



CV: Motion & Tracking, ROB: Motion and Path Planning, MAS: Agent-Based Simulation and Emergent Behavior


Pedestrian trajectory prediction is the key technology in many applications for providing insights into human behavior and anticipating human future motions. Most existing empirical models are explicitly formulated by observed human behaviors using explicable mathematical terms with deterministic nature, while recent work has focused on developing hybrid models combined with learning-based techniques for powerful expressiveness while maintaining explainability. However, the deterministic nature of the learned steering behaviors from the empirical models limits the models' practical performance. To address this issue, this work proposes the social conditional variational autoencoder (SocialCVAE) for predicting pedestrian trajectories, which employs a CVAE to explore behavioral uncertainty in human motion decisions. SocialCVAE learns socially reasonable motion randomness by utilizing a socially explainable interaction energy map as the CVAE's condition, which illustrates the future occupancy of each pedestrian's local neighborhood area. The energy map is generated using an energy-based interaction model, which anticipates the energy cost (i.e., repulsion intensity) of pedestrians' interactions with neighbors. Experimental results on two public benchmarks including 25 scenes demonstrate that SocialCVAE significantly improves prediction accuracy compared with the state-of-the-art methods, with up to 16.85% improvement in Average Displacement Error (ADE) and 69.18% improvement in Final Displacement Error (FDE). Code is available at:



How to Cite

Xiang, W., YIN, H., Wang, H., & Jin, X. (2024). SocialCVAE: Predicting Pedestrian Trajectory via Interaction Conditioned Latents. Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 6216-6224.



AAAI Technical Track on Computer Vision V