OctFormer: Efficient Octree-Based Transformer for Point Cloud Compression with Local Enhancement

Authors

  • Mingyue Cui, Sun Yat-sen University
  • Junhua Long, Sun Yat-sen University
  • Mingjian Feng, Sun Yat-sen University
  • Boyang Li, Sun Yat-sen University
  • Huang Kai, Sun Yat-sen University

DOI:

https://doi.org/10.1609/aaai.v37i1.25121

Keywords:

CV: 3D Computer Vision, CV: Applications, APP: Internet of Things, Sensor Networks & Smart Cities, ROB: Applications

Abstract

Point cloud compression with a high compression ratio and minimal loss is essential for efficient data transmission. However, previous methods that depend on 3D convolution or frequent multi-head self-attention operations incur heavy computation. To address this problem, we propose an octree-based Transformer compression method called OctFormer, which does not rely on the occupancy information of sibling nodes. Our method uses non-overlapped context windows to construct octree node sequences and shares the result of one multi-head self-attention operation among a sequence of nodes. Besides, we introduce a local enhancement module for exploiting sibling features and a positional encoding generator for enhancing the translation invariance of the octree node sequence. Compared with previous state-of-the-art works, our method achieves up to 17% bpp savings over the voxel-context-based baseline and reduces overall coding time by 99% relative to the attention-based baseline.
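To make the abstract's key idea concrete, the following is a minimal NumPy sketch, not the paper's actual architecture: octree node features are partitioned into non-overlapped context windows, and one multi-head self-attention pass is computed per window, so every node in a window shares that single attention result. The window size, head count, and random projection weights here are illustrative assumptions standing in for learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def windowed_self_attention(feats, window, num_heads, rng):
    """Run one multi-head self-attention per non-overlapped window.

    feats: (n, d) array of per-node features, n divisible by `window`.
    Nodes attend only to nodes inside their own context window, so one
    attention computation is shared by all nodes of that window.
    """
    n, d = feats.shape
    assert n % window == 0 and d % num_heads == 0
    dh = d // num_heads
    # random projections stand in for learned Q/K/V weight matrices
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    out = np.empty_like(feats)
    for start in range(0, n, window):
        x = feats[start:start + window]                      # one context window
        def heads(W):
            # project, then split into (num_heads, window, dh)
            return (x @ W).reshape(window, num_heads, dh).transpose(1, 0, 2)
        q, k, v = heads(Wq), heads(Wk), heads(Wv)
        attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))
        out[start:start + window] = (attn @ v).transpose(1, 0, 2).reshape(window, d)
    return out
```

Because the windows do not overlap, changing the features of one window leaves the outputs of every other window untouched, which is what bounds the cost to one attention pass per window rather than per node.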

Published

2023-06-26

How to Cite

Cui, M., Long, J., Feng, M., Li, B., & Kai, H. (2023). OctFormer: Efficient Octree-Based Transformer for Point Cloud Compression with Local Enhancement. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 470-478. https://doi.org/10.1609/aaai.v37i1.25121

Section

AAAI Technical Track on Computer Vision I