PointChain: Learning Generalizable Point Cloud Representations via Structural Chain Modeling

Authors

  • Luyao Wang, University of Science and Technology of China; National Key Laboratory of Deep Space Exploration, Deep Space Exploration Laboratory
  • Chuxin Wang, University of Science and Technology of China; National Key Laboratory of Deep Space Exploration, Deep Space Exploration Laboratory
  • Qiao Li, University of Science and Technology of China
  • Tianzhu Zhang, University of Science and Technology of China; National Key Laboratory of Deep Space Exploration, Deep Space Exploration Laboratory

DOI

https://doi.org/10.1609/aaai.v40i12.37961

Abstract

Recent advances in point cloud analysis have increasingly leveraged large-scale unlabeled data through self-supervised representation learning. Autoregressive models based on next-token prediction have shown strong performance, but they usually model point clouds as linear sequences, ignoring their inherent spatial structure. To address this limitation, we propose PointChain, a novel autoregressive paradigm inspired by human perception mechanisms and designed to better align with the structural properties of point clouds. Specifically, we introduce structural chain encoding, which models the understanding process as global-to-local structural chain inference, preserving spatial relationships throughout the prediction sequence. During pre-training, we design two auxiliary tasks: a next-scale prediction task that encourages cross-scale reasoning, and a scale-level contrastive learning task that promotes semantic consistency across scales. These components guide the model to learn more discriminative and generalizable point cloud representations. Experiments on multiple benchmarks, using both Transformer and Mamba backbones, validate the effectiveness of our approach. PointChain achieves state-of-the-art performance on several downstream tasks, including 93.75% accuracy on the hardest split of ScanObjectNN without a voting strategy.
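
To make the global-to-local idea concrete, the following is a minimal sketch of one way a point cloud could be arranged into a coarse-to-fine chain of scales. It is an illustration under stated assumptions, not the authors' method: the abstract does not say how scales are built, so the sketch assumes farthest point sampling, and the helper name `build_scale_chain`, the scale sizes (4, 16, 64), and the random input cloud are all hypothetical.

```python
import numpy as np


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k indices so that each new point is far from those already chosen."""
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    dist = np.linalg.norm(points - points[start], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)


def build_scale_chain(points, scale_sizes=(4, 16, 64)):
    """Hypothetical helper: return index arrays ordered coarse to fine.

    Assumption: scales are nested prefixes of a single farthest-point ordering,
    so every coarser scale is contained in each finer one.
    """
    order = farthest_point_sampling(points, max(scale_sizes))
    return [order[:s] for s in sorted(scale_sizes)]


# Illustrative input only: a random cloud of 1024 points in 3D.
rng = np.random.default_rng(0)
cloud = rng.standard_normal((1024, 3))
chain = build_scale_chain(cloud)
```

Under these assumptions, a next-scale prediction objective would condition on `cloud[chain[i]]` to predict the structure at `chain[i + 1]`. The nesting keeps coarse points fixed as finer detail is added, which is one simple way to preserve spatial relationships across the sequence.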

Published

2026-03-14

How to Cite

Wang, L., Wang, C., Li, Q., & Zhang, T. (2026). PointChain: Learning Generalizable Point Cloud Representations via Structural Chain Modeling. Proceedings of the AAAI Conference on Artificial Intelligence, 40(12), 9957-9965. https://doi.org/10.1609/aaai.v40i12.37961

Section

AAAI Technical Track on Computer Vision IX