Self-Supervised Bird’s Eye View Motion Prediction with Cross-Modality Signals
DOI:
https://doi.org/10.1609/aaai.v38i2.27940
Keywords:
CV: Vision for Robotics & Autonomous Driving, ML: Unsupervised & Self-Supervised Learning
Abstract
Learning the dense bird's eye view (BEV) motion flow in a self-supervised manner is an emerging research topic in robotics and autonomous driving. Current self-supervised methods mainly rely on point correspondences between point clouds, which may introduce the problems of fake flow and inconsistency, hindering the model’s ability to learn accurate and realistic motion. In this paper, we introduce a novel cross-modality self-supervised training framework that effectively addresses these issues by leveraging multi-modality data to obtain supervision signals. We design three innovative supervision signals to preserve the inherent properties of scene motion: the masked Chamfer distance loss, the piecewise rigidity loss, and the temporal consistency loss. Through extensive experiments, we demonstrate that our proposed self-supervised framework outperforms all previous self-supervised methods on the motion prediction task.
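As a rough illustration of the first supervision signal named in the abstract, the sketch below computes a symmetric Chamfer distance over only those points that a boolean mask keeps. It is not the paper's formulation: the function name `masked_chamfer_distance`, the mask arguments, and the use of squared distances are assumptions made for this example, and the abstract does not specify how the masks are derived.

```python
import numpy as np


def masked_chamfer_distance(src, dst, src_mask, dst_mask):
    """Symmetric Chamfer distance between two point sets, using only masked-in points.

    src, dst: float arrays of shape (N, D) and (M, D).
    src_mask, dst_mask: boolean arrays of shape (N,) and (M,) (hypothetical masks).
    """
    src = src[src_mask]
    dst = dst[dst_mask]
    # Assumption: with nothing left to compare, the loss contributes zero.
    if len(src) == 0 or len(dst) == 0:
        return 0.0
    # Pairwise squared Euclidean distances, shape (len(src), len(dst)).
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
    # Mean nearest-neighbor distance in both directions.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


# Toy example: the third source point is an outlier that the mask removes.
src = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
dst = np.array([[0.0, 0.0], [1.0, 0.0]])
keep_all_src = np.array([True, True, True])
keep_inliers = np.array([True, True, False])
keep_all_dst = np.array([True, True])

unmasked = masked_chamfer_distance(src, dst, keep_all_src, keep_all_dst)
masked = masked_chamfer_distance(src, dst, keep_inliers, keep_all_dst)
```

In a flow-learning setting, `src` would typically be the current points warped by the predicted motion and `dst` the points of the next frame, so masking out unreliable points keeps them from pulling the prediction toward spurious correspondences.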
Published
2024-03-24
How to Cite
Fang, S., Liu, Z., Wang, M., Xu, C., Zhong, Y., & Chen, S. (2024). Self-Supervised Bird’s Eye View Motion Prediction with Cross-Modality Signals. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 1726–1734. https://doi.org/10.1609/aaai.v38i2.27940
Section
AAAI Technical Track on Computer Vision I