OctFormer: Efficient Octree-Based Transformer for Point Cloud Compression with Local Enhancement
DOI:
https://doi.org/10.1609/aaai.v37i1.25121
Keywords:
CV: 3D Computer Vision, CV: Applications, APP: Internet of Things, Sensor Networks & Smart Cities, ROB: Applications
Abstract
Point cloud compression with a high compression ratio and minimal loss is essential for efficient data transmission. However, previous methods that depend on 3D convolution or frequent multi-head self-attention operations incur a heavy computational cost. To address this problem, we propose an octree-based Transformer compression method called OctFormer, which does not rely on the occupancy information of sibling nodes. Our method uses non-overlapping context windows to construct octree node sequences and shares the result of a multi-head self-attention operation among a sequence of nodes. In addition, we introduce a local enhancement module for exploiting sibling features and a positional encoding generator for enhancing the translation invariance of the octree node sequence. Compared to previous state-of-the-art works, our method obtains up to 17% Bpp savings over the voxel-context-based baseline and saves 99% of overall coding time relative to the attention-based baseline.
Published
2023-06-26
How to Cite
Cui, M., Long, J., Feng, M., Li, B., & Kai, H. (2023). OctFormer: Efficient Octree-Based Transformer for Point Cloud Compression with Local Enhancement. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 470-478. https://doi.org/10.1609/aaai.v37i1.25121
Section
AAAI Technical Track on Computer Vision I