TransLO: A Window-Based Masked Point Transformer Framework for Large-Scale LiDAR Odometry

Authors

  • Jiuming Liu, Shanghai Jiao Tong University
  • Guangming Wang, Shanghai Jiao Tong University
  • Chaokang Jiang, China University of Mining and Technology
  • Zhe Liu, Shanghai Jiao Tong University
  • Hesheng Wang, Shanghai Jiao Tong University

DOI:

https://doi.org/10.1609/aaai.v37i2.25256

Keywords:

CV: Vision for Robotics & Autonomous Driving, CV: 3D Computer Vision

Abstract

Recently, the transformer architecture has achieved great success in computer vision tasks such as image classification and object detection. Nonetheless, its application to 3D vision remains underexplored, since point clouds are inherently sparse, irregular, and unordered. Furthermore, existing point transformer frameworks usually feed the raw N×3 point cloud into transformers, which limits the number of points they can process because the computational cost is quadratic in the input size N. In this paper, we rethink the structure of the point transformer. Instead of applying a transformer directly to points, our network (TransLO) processes tens of thousands of points simultaneously by projecting them onto a 2D surface and feeding them into a local transformer with linear complexity. Specifically, it consists mainly of two components: a Window-based Masked transformer with Self Attention (WMSA) to capture long-range dependencies, and Masked Cross-Frame Attention (MCFA) to associate two frames and estimate the pose. To deal with the sparsity of point clouds, we propose a binary mask to remove invalid and dynamic points. To our knowledge, this is the first transformer-based LiDAR odometry network. Experimental results on the KITTI odometry dataset show that our average rotation and translation RMSE reach 0.500°/100m and 0.993%, respectively. Our network surpasses all recent learning-based methods and even outperforms LOAM on most evaluation sequences. Code will be released at https://github.com/IRMVLab/TransLO.
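
The two ideas the abstract leans on, projecting points onto a 2D surface and running masked attention inside local windows, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The spherical projection, the grid size, the field-of-view values, the window size, and the function names `project_to_grid` and `masked_window_attention` are all assumptions chosen for illustration. The attention uses the raw features as queries, keys, and values with no learned weights, and the mask here marks only empty grid cells (dynamic-point removal is not modelled).

```python
import numpy as np

def project_to_grid(points, height=16, width=64, fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) cloud onto an (H, W, 3) grid; also return a binary validity mask.

    Assumed spherical projection: columns index azimuth, rows index elevation.
    """
    rng = np.linalg.norm(points, axis=1)
    keep = rng > 1e-6                              # drop degenerate points at the origin
    pts, rng = points[keep], rng[keep]
    yaw = np.arctan2(pts[:, 1], pts[:, 0])         # azimuth in [-pi, pi]
    pitch = np.arcsin(pts[:, 2] / rng)             # elevation
    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    col = np.floor((yaw + np.pi) / (2 * np.pi) * width).astype(int) % width
    row = np.floor((fov_up - pitch) / (fov_up - fov_down) * height).astype(int)
    inside = (row >= 0) & (row < height)           # discard points outside the vertical FOV
    grid = np.zeros((height, width, 3))
    mask = np.zeros((height, width), dtype=bool)
    grid[row[inside], col[inside]] = pts[inside]
    mask[row[inside], col[inside]] = True          # True where a cell holds a real point
    return grid, mask

def masked_window_attention(feat, mask, win=4):
    """Self-attention inside non-overlapping win x win windows, ignoring invalid cells."""
    H, W, C = feat.shape
    assert H % win == 0 and W % win == 0
    T = win * win
    # Partition into windows: (num_windows, T, C) and (num_windows, T)
    f = feat.reshape(H // win, win, W // win, win, C).transpose(0, 2, 1, 3, 4).reshape(-1, T, C)
    m = mask.reshape(H // win, win, W // win, win).transpose(0, 2, 1, 3).reshape(-1, T)
    scores = f @ f.transpose(0, 2, 1) / np.sqrt(C)      # (num_windows, T, T)
    scores = np.where(m[:, None, :], scores, -1e9)      # invalid keys get ~zero weight
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    out = (weights @ f) * m[:, :, None]                 # zero the outputs of invalid queries
    # Undo the window partition
    return out.reshape(H // win, W // win, win, win, C).transpose(0, 2, 1, 3, 4).reshape(H, W, C)

points = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, -0.5], [0.0, 0.0, 0.0]])
grid, mask = project_to_grid(points)
out = masked_window_attention(grid, mask, win=4)
```

Each window attends over only win² cells, so the per-window cost is fixed and the total cost grows with the number of windows, that is, linearly in the grid size H×W rather than quadratically in the number of points.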

Published

2023-06-26

How to Cite

Liu, J., Wang, G., Jiang, C., Liu, Z., & Wang, H. (2023). TransLO: A Window-Based Masked Point Transformer Framework for Large-Scale LiDAR Odometry. Proceedings of the AAAI Conference on Artificial Intelligence, 37(2), 1683-1691. https://doi.org/10.1609/aaai.v37i2.25256

Section

AAAI Technical Track on Computer Vision II