End-to-End Real-Time Vanishing Point Detection with Transformer
DOI:
https://doi.org/10.1609/aaai.v38i6.28331
Keywords:
CV: 3D Computer Vision, CV: Low Level & Physics-based Vision
Abstract
In this paper, we propose a novel transformer-based end-to-end real-time vanishing point detection method, named Vanishing Point TRansformer (VPTR). The proposed method directly regresses the locations of vanishing points from given images. To achieve this, we pose vanishing point detection as a point object detection task on the Gaussian hemisphere with region division. Because low-level features provide more geometric information, which contributes to accurate vanishing point prediction, we propose a clean architecture in which the vanishing point queries in the VPTR decoder directly gather multi-level features from the CNN backbone via deformable attention. Our method relies on neither line detection nor the Manhattan world assumption, which makes it more flexible to use. VPTR runs at an inference speed of 140 FPS on one NVIDIA 3090 card. Experimental results on synthetic and real-world datasets demonstrate that our method works in both natural and structural scenes and is superior to other state-of-the-art methods in the balance of accuracy and efficiency.
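The abstract's formulation, treating vanishing points as point objects on the Gaussian hemisphere, builds on the standard mapping from an image-plane point to a unit direction on the Gaussian (unit) sphere. A minimal sketch of that mapping, assuming a pinhole camera model; the intrinsics matrix `K` and the pixel coordinates below are illustrative values, not ones taken from the paper:

```python
import numpy as np

def image_point_to_hemisphere(u, v, K):
    """Map an image point (u, v) to a unit direction on the Gaussian hemisphere.

    The direction is proportional to K^{-1} [u, v, 1]^T, normalized to unit
    length and flipped, if needed, so its z-component is non-negative
    (i.e., it lies on the upper hemisphere).
    """
    d = np.linalg.inv(K) @ np.array([u, v, 1.0])
    d /= np.linalg.norm(d)
    return d if d[2] >= 0 else -d

# Illustrative pinhole intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

d = image_point_to_hemisphere(800.0, 100.0, K)
```

Because the hemisphere covers the whole (projective) image plane with a bounded domain, vanishing points at or near infinity become ordinary points on the sphere, which is what makes a point-object-detection formulation well posed.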
Published
2024-03-24
How to Cite
Tong, X., Peng, S., Guo, Y., & Huang, X. (2024). End-to-End Real-Time Vanishing Point Detection with Transformer. Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 5243-5251. https://doi.org/10.1609/aaai.v38i6.28331
Section
AAAI Technical Track on Computer Vision V