OctFormer: Efficient Octree-Based Transformer for Point Cloud Compression with Local Enhancement

Authors

  • Mingyue Cui, Sun Yat-sen University
  • Junhua Long, Sun Yat-sen University
  • Mingjian Feng, Sun Yat-sen University
  • Boyang Li, Sun Yat-sen University
  • Huang Kai, Sun Yat-sen University

DOI:

https://doi.org/10.1609/aaai.v37i1.25121

Keywords:

CV: 3D Computer Vision, CV: Applications, APP: Internet of Things, Sensor Networks & Smart Cities, ROB: Applications

Abstract

Point cloud compression with a high compression ratio and minimal loss is essential for efficient data transmission. However, previous methods that depend on 3D convolution or frequent multi-head self-attention operations incur heavy computation. To address this problem, we propose an octree-based Transformer compression method called OctFormer, which does not rely on the occupancy information of sibling nodes. Our method uses non-overlapped context windows to construct octree node sequences and shares the result of one multi-head self-attention operation among a sequence of nodes. Besides, we introduce a local enhancement module for exploiting sibling features and a positional encoding generator for enhancing the translation invariance of the octree node sequence. Compared with previous state-of-the-art works, our method achieves up to 17% bpp savings over the voxel-context-based baseline and reduces overall coding time by 99% relative to the attention-based baseline.
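To make the abstract's key idea concrete, the following is a minimal NumPy sketch, not the paper's actual architecture: octree node features are partitioned into non-overlapped context windows, and one multi-head self-attention pass is computed per window, so every node in a window shares that single attention result. The window size, head count, and random projection weights here are illustrative assumptions standing in for learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def windowed_self_attention(feats, window, num_heads, rng):
    """Run one multi-head self-attention per non-overlapped window.

    feats: (n, d) array of per-node features, n divisible by `window`.
    Nodes attend only to nodes inside their own context window, so one
    attention computation is shared by all nodes of that window.
    """
    n, d = feats.shape
    assert n % window == 0 and d % num_heads == 0
    dh = d // num_heads
    # random projections stand in for learned Q/K/V weight matrices
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    out = np.empty_like(feats)
    for start in range(0, n, window):
        x = feats[start:start + window]                      # one context window
        def heads(W):
            # project, then split into (num_heads, window, dh)
            return (x @ W).reshape(window, num_heads, dh).transpose(1, 0, 2)
        q, k, v = heads(Wq), heads(Wk), heads(Wv)
        attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))
        out[start:start + window] = (attn @ v).transpose(1, 0, 2).reshape(window, d)
    return out
```

Because the windows do not overlap, changing the features of one window leaves the outputs of every other window untouched, which is what bounds the cost to one attention pass per window rather than per node.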

Published

2023-06-26

How to Cite

Cui, M., Long, J., Feng, M., Li, B., & Kai, H. (2023). OctFormer: Efficient Octree-Based Transformer for Point Cloud Compression with Local Enhancement. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 470-478. https://doi.org/10.1609/aaai.v37i1.25121

Section

AAAI Technical Track on Computer Vision I