PointMC: Multi-view Consistent Encoding and Center-Global Feature Fusion for Point Clouds Understanding

Xinxing Yu; Ajian Liu; Sunyuan Qiang; Yuzhong Wang; Hui Ma; Yanyan Liang

doi:10.1609/aaai.v40i14.38207

Authors

Xinxing Yu, Faculty of Innovation Engineering, Macau University of Science and Technology
Ajian Liu, Faculty of Innovation Engineering, Macau University of Science and Technology; MAIS, Institute of Automation, Chinese Academy of Sciences
Sunyuan Qiang, Southwest Institute of Technical Physics
Yuzhong Wang, Faculty of Innovation Engineering, Macau University of Science and Technology
Hui Ma, Faculty of Innovation Engineering, Macau University of Science and Technology; School of Computing and Information Technology, Great Bay University
Yanyan Liang, Faculty of Innovation Engineering, Macau University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v40i14.38207

Abstract

Point cloud tasks have recently benefited from Mamba-based architectures, which leverage state space modeling to achieve strong performance. Previous studies have primarily focused on network design while overlooking the importance of position encoding and relying on coarse-grained geometric feature aggregation. The former leads to semantic ambiguity due to inconsistent spatial relationships, while the latter results in geometric feature dispersion by overlooking fine-grained local geometric details. To tackle these problems, we propose a novel framework, PointMC, comprising Multi-view Consistent Learnable Position Encoding (MCLPE) and Center-Global Feature Fusion (CGFF), which provide semantically coherent positional guidance across patches and enable fine-grained geometric structure aggregation within each patch. Specifically, the proposed MCLPE module, inspired by a spatial structure modeling mechanism guided by physical constraints, leverages multi-view virtual reconstruction and a learnable strategy to dynamically constrain spatial relationships along patch boundaries, thereby enhancing semantic consistency and representational clarity across inter-patch regions. Furthermore, to address the lack of local structural information within each patch, the CGFF module employs a dual-guidance mechanism based on center and global structures to promote the aggregation of local geometric features. Extensive experiments on multiple benchmark datasets validate the effectiveness of PointMC, which consistently outperforms existing state-of-the-art methods and demonstrates superior capability in capturing both inter-patch semantic consistency and intra-patch geometric details.
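
The abstract does not specify how CGFF combines center and global guidance, so the following is only a toy illustration of the general idea, not the authors' module. It assumes a patch is a small list of 3D points, and it builds a descriptor from each point's offset to the patch center and to the centroid of the whole cloud, then max-pools over the patch. The names `centroid` and `center_global_descriptor` are hypothetical and introduced here for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Mean of a non-empty sequence of 3D points."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def center_global_descriptor(
    patch: Sequence[Point], cloud: Sequence[Point]
) -> List[float]:
    """Toy patch descriptor (an assumption, not the paper's CGFF).

    Each point contributes 6 channels: its offset from the patch center
    (local guidance) followed by its offset from the cloud centroid
    (global guidance). Channels are max-pooled over the patch.
    """
    c_local = centroid(patch)
    c_global = centroid(cloud)
    per_point = [
        [p[i] - c_local[i] for i in range(3)]
        + [p[i] - c_global[i] for i in range(3)]
        for p in patch
    ]
    # Channel-wise max pooling makes the result independent of point order.
    return [max(f[ch] for f in per_point) for ch in range(6)]


if __name__ == "__main__":
    cloud = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (2, 2, 0)]
    patch = cloud[:2]
    print(center_global_descriptor(patch, cloud))
```

In a learned model the raw offsets would typically pass through trainable layers before pooling; the sketch omits that step to keep the center-versus-global distinction visible.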
