Temporal Pyramid Network for Pedestrian Trajectory Prediction with Multi-Supervision

Authors

  • Rongqin Liang, Guangdong Key Laboratory of Intelligent Information Processing, College of Electronics and Information Engineering, Shenzhen University
  • Yuanman Li, Guangdong Key Laboratory of Intelligent Information Processing, College of Electronics and Information Engineering, Shenzhen University
  • Xia Li, Guangdong Key Laboratory of Intelligent Information Processing, College of Electronics and Information Engineering, Shenzhen University
  • Yi Tang, Guangdong Key Laboratory of Intelligent Information Processing, College of Electronics and Information Engineering, Shenzhen University
  • Jiantao Zhou, State Key Laboratory of Internet of Things for Smart City, Department of Computer and Information Science, University of Macau
  • Wenbin Zou, Guangdong Key Laboratory of Intelligent Information Processing, College of Electronics and Information Engineering, Shenzhen University

DOI:

https://doi.org/10.1609/aaai.v35i3.16299

Keywords:

Motion & Tracking, Video Understanding & Activity Analysis, Motion and Path Planning, Behavior Learning & Control

Abstract

Predicting human motion behavior in a crowd is important for many applications, ranging from the natural navigation of autonomous vehicles to intelligent security systems for video surveillance. Previous works model and predict the trajectory at a single temporal resolution, which makes it difficult to simultaneously exploit the long-range information (e.g., the destination of the trajectory) and the short-range information (e.g., the walking direction and speed at a certain time) of the motion behavior. In this paper, we propose a temporal pyramid network for pedestrian trajectory prediction, built from a squeeze modulation and a dilation modulation. Our hierarchical framework builds a feature pyramid with increasingly richer temporal information from top to bottom, which better captures the motion behavior at various tempos. Furthermore, we propose a coarse-to-fine fusion strategy with multi-supervision. By progressively merging the coarse top-level features, which carry global context, into the fine bottom-level features, which carry rich local context, our method can fully exploit both the long-range and short-range information of the trajectory. Experimental results on two benchmarks demonstrate the superiority of our method. Our code and models will be available upon acceptance.
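
The general idea of a temporal pyramid with coarse-to-fine fusion can be illustrated with a small sketch. This is not the authors' network: the abstract does not specify the squeeze or dilation modulations, so the version below assumes plain window averaging for the squeeze step, nearest-neighbor upsampling, and a fixed blending weight in place of learned fusion. All function names (`squeeze`, `build_pyramid`, `upsample`, `fuse_coarse_to_fine`) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # one (x, y) position per time step


def squeeze(seq: List[Point], factor: int = 2) -> List[Point]:
    """Lower the temporal resolution by averaging non-overlapping windows.

    Assumption: simple averaging stands in for a learned squeeze operation.
    """
    out = []
    for i in range(0, len(seq) - factor + 1, factor):
        window = seq[i:i + factor]
        out.append((sum(p[0] for p in window) / factor,
                    sum(p[1] for p in window) / factor))
    return out


def build_pyramid(seq: List[Point], levels: int) -> List[List[Point]]:
    """Return [finest, ..., coarsest]; each level halves the sequence length."""
    pyramid = [seq]
    for _ in range(levels - 1):
        pyramid.append(squeeze(pyramid[-1]))
    return pyramid


def upsample(seq: List[Point], target_len: int) -> List[Point]:
    """Nearest-neighbor upsampling of a coarse sequence to target_len steps."""
    n = len(seq)
    return [seq[min(i * n // target_len, n - 1)] for i in range(target_len)]


def fuse_coarse_to_fine(pyramid: List[List[Point]],
                        weight: float = 0.5) -> List[Point]:
    """Merge levels from coarsest to finest.

    Assumption: a fixed blending weight replaces any learned fusion.
    """
    fused = pyramid[-1]
    for finer in reversed(pyramid[:-1]):
        up = upsample(fused, len(finer))
        fused = [((1 - weight) * f[0] + weight * u[0],
                  (1 - weight) * f[1] + weight * u[1])
                 for f, u in zip(finer, up)]
    return fused


if __name__ == "__main__":
    # A pedestrian walking along the x-axis for 8 time steps.
    track = [(float(t), 0.0) for t in range(8)]
    pyramid = build_pyramid(track, levels=3)
    print([len(level) for level in pyramid])
    print(fuse_coarse_to_fine(pyramid))
```

In this sketch the coarsest level summarizes where the whole track is heading, while the finest level keeps step-by-step motion; the fused output has the full temporal length but is pulled toward the coarse summary.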

Published

2021-05-18

How to Cite

Liang, R., Li, Y., Li, X., Tang, Y., Zhou, J., & Zou, W. (2021). Temporal Pyramid Network for Pedestrian Trajectory Prediction with Multi-Supervision. Proceedings of the AAAI Conference on Artificial Intelligence, 35(3), 2029-2037. https://doi.org/10.1609/aaai.v35i3.16299

Section

AAAI Technical Track on Computer Vision II