Laneformer: Object-Aware Row-Column Transformers for Lane Detection

Authors

  • Jianhua Han Huawei Noah's Ark Lab
  • Xiajun Deng Sun Yat-sen University
  • Xinyue Cai Huawei Noah's Ark Lab
  • Zhen Yang Huawei Noah's Ark Lab
  • Hang Xu Huawei Noah's Ark Lab
  • Chunjing Xu Huawei Noah's Ark Lab
  • Xiaodan Liang Sun Yat-sen University

DOI:

https://doi.org/10.1609/aaai.v36i1.19961

Keywords:

Computer Vision (CV)

Abstract

We present Laneformer, a conceptually simple yet powerful transformer-based architecture tailored for lane detection, a long-standing research topic in visual perception for autonomous driving. The dominant paradigms rely on purely CNN-based architectures, which often fail to incorporate relations among long-range lane points and the global contexts induced by surrounding objects (e.g., pedestrians, vehicles). Inspired by recent advances of the transformer encoder-decoder architecture in various vision tasks, we design a new end-to-end Laneformer architecture that adapts conventional transformers to better capture the shape and semantic characteristics of lanes, with minimal overhead in latency. First, coupled with deformable pixel-wise self-attention in the encoder, Laneformer introduces two new row and column self-attention operations to efficiently mine point context along the lane shapes. Second, motivated by the observation that surrounding objects affect the prediction of lane segments, Laneformer further includes detected object instances as extra inputs to the multi-head attention blocks in the encoder and decoder, facilitating lane point detection by sensing semantic contexts. Specifically, the bounding box locations of objects are added into the Key module to provide interaction with each pixel and query, while the ROI-aligned features are inserted into the Value module. Extensive experiments demonstrate that our Laneformer achieves state-of-the-art performance on the CULane benchmark with a 77.1% F1 score. We hope our simple and effective Laneformer will serve as a strong baseline for future research on self-attention models for lane detection.
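The row and column self-attention described above restrict each pixel's attention to its own row or column, rather than the full feature map. Below is a minimal NumPy sketch of that idea, not the paper's implementation: the function `axis_self_attention` and its shapes are illustrative assumptions, and learned Q/K/V projections, multi-head splitting, and the deformable pixel-wise attention are all omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def axis_self_attention(feat, axis):
    """Self-attention restricted to one spatial axis of an (H, W, C) feature map.

    axis=1 -> row attention: each pixel attends to pixels in its row.
    axis=0 -> column attention: each pixel attends to pixels in its column.
    Q = K = V = feat here; a real model would use learned projections.
    (Hypothetical sketch, not the Laneformer implementation.)
    """
    H, W, C = feat.shape
    # treat each row (or column) as an independent attention sequence
    seq = feat if axis == 1 else feat.transpose(1, 0, 2)   # (N, L, C)
    scores = seq @ seq.transpose(0, 2, 1) / np.sqrt(C)     # (N, L, L)
    out = softmax(scores, axis=-1) @ seq                   # (N, L, C)
    return out if axis == 1 else out.transpose(1, 0, 2)    # back to (H, W, C)

# toy feature map: 4 rows, 6 columns, 8 channels
feat = np.random.default_rng(0).standard_normal((4, 6, 8))
row_out = axis_self_attention(feat, axis=1)
col_out = axis_self_attention(feat, axis=0)
print(row_out.shape, col_out.shape)  # (4, 6, 8) (4, 6, 8)
```

Because attention sequences have length W (rows) or H (columns) instead of H*W, this is far cheaper than full pixel-wise attention, which is why it can mine point context along elongated lane shapes with little added latency.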

Published

2022-06-28

How to Cite

Han, J., Deng, X., Cai, X., Yang, Z., Xu, H., Xu, C., & Liang, X. (2022). Laneformer: Object-Aware Row-Column Transformers for Lane Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 36(1), 799-807. https://doi.org/10.1609/aaai.v36i1.19961

Section

AAAI Technical Track on Computer Vision I