Improved MLP Point Cloud Processing with High-Dimensional Positional Encoding

Yanmei Zou; Hongshan Yu; Zhengeng Yang; Zechuan Li; Naveed Akhtar

doi:10.1609/aaai.v38i7.28625

Authors

Yanmei Zou, Hunan University
Hongshan Yu, Hunan University
Zhengeng Yang, Hunan Normal University
Zechuan Li, Hunan University
Naveed Akhtar, The University of Melbourne

DOI:

https://doi.org/10.1609/aaai.v38i7.28625

Keywords:

CV: 3D Computer Vision, CV: Scene Analysis & Understanding, CV: Segmentation

Abstract

Multi-Layer Perceptron (MLP) models are the bedrock of contemporary point cloud processing. However, their complex network architectures obscure the source of their strength. We first develop an “abstraction and refinement” (ABS-REF) view for the neural modeling of point clouds. This view elucidates that, whereas the early models focused on the ABS stage, more recent techniques devise sophisticated REF stages to attain a performance advantage in point cloud processing. We then borrow the concept of “positional encoding” from the transformer literature and propose a High-dimensional Positional Encoding (HPE) module, which can be readily deployed in MLP-based architectures. We leverage our module to develop a suite of HPENet models, which are MLP networks that follow the ABS-REF paradigm, albeit with a sophisticated HPE-based REF stage. The developed technique is extensively evaluated for 3D object classification, object part segmentation, semantic segmentation and object detection. We establish new state-of-the-art results of 87.6 mAcc on ScanObjectNN for object classification, 85.5 class mIoU on ShapeNetPart for object part segmentation, and 72.7 and 78.7 mIoU on the Area-5 and 6-fold experiments with S3DIS for semantic segmentation. The source code for this work is available at https://github.com/zouyanmei/HPENet.
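
The abstract does not specify how the HPE module is built, so the following is only a rough illustration of the borrowed idea, not the authors' method. It applies the standard sinusoidal positional encoding from the transformer literature to 3D coordinates, lifting each point to a higher-dimensional feature vector. The function name `sinusoidal_encoding` and the parameters `num_freqs` and `base` are hypothetical choices made for this sketch.

```python
import numpy as np

def sinusoidal_encoding(rel_xyz, num_freqs=4, base=10000.0):
    """Lift (..., 3) coordinates to (..., 3 * 2 * num_freqs) features.

    Assumption: this is the generic transformer-style sinusoidal encoding,
    used here for illustration only; it is not the HPE module of the paper.
    """
    rel_xyz = np.asarray(rel_xyz, dtype=np.float64)
    # Geometric frequency ladder: base^(-k / num_freqs) for k = 0..num_freqs-1.
    freqs = base ** (-np.arange(num_freqs) / num_freqs)
    # Broadcast each coordinate against every frequency: (..., 3, num_freqs).
    angles = rel_xyz[..., None] * freqs
    # Sine and cosine responses side by side: (..., 3, 2 * num_freqs).
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    # Flatten the per-axis features into one vector per point.
    return enc.reshape(*rel_xyz.shape[:-1], -1)

# Usage: encode the offsets of 8 neighbours around each of 5 centre points.
rng = np.random.default_rng(0)
centres = rng.normal(size=(5, 1, 3))
neighbours = rng.normal(size=(5, 8, 3))
features = sinusoidal_encoding(neighbours - centres)  # shape (5, 8, 24)
```

In a sketch like this, the encoded offsets would typically be passed through an MLP alongside the point features; how HPENet actually combines them is described in the paper and the linked repository.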
