FEAST-Mamba: FEAture and SpaTial Aware Mamba Network with Bidirectional Orthogonal Fusion for Cross-Modal Point Cloud Segmentation

Authors

  • Chade Li: State Key Laboratory of Multimodal Artificial Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
  • Pengju Zhang: State Key Laboratory of Multimodal Artificial Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  • Bo Liu: State Key Laboratory of Multimodal Artificial Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  • Hao Wei: State Key Laboratory of Multimodal Artificial Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  • Yihong Wu: State Key Laboratory of Multimodal Artificial Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China

DOI:

https://doi.org/10.1609/aaai.v39i5.32489

Abstract

Point cloud segmentation has a wide range of applications in autonomous driving, augmented reality, and virtual reality. Multi-modal fusion strategies have recently received increasing attention in point cloud segmentation. Despite their success, existing methods often suffer from unnecessary information loss or redundancy. In this paper, we propose FEAST-Mamba, a novel FEAture and SpaTial aware Mamba network for multi-modal point cloud segmentation. To exploit the complementarity between modalities, we propose a bidirectional orthogonal attention module, in which features from each modality first interact bidirectionally through cross-modal attention, and orthogonal fusion is then used to reduce feature redundancy. Furthermore, we propose a reordering strategy for the Mamba architecture that takes both spatial and semantic information into account during cross-modal feature ordering. Experiments on the indoor datasets S3DIS and ScanNet and the outdoor datasets nuScenes and SemanticKITTI show that the proposed method achieves state-of-the-art performance.
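
The abstract does not give the module's equations, so the following is only a minimal NumPy sketch of the general idea: two feature sets attend to each other in both directions, and one is then reduced to its component orthogonal to the other before concatenation. Everything here is an assumption made for illustration and not the authors' implementation: the function names, the single-head attention without learned projections, the residual additions, the row-wise alignment of point and image features, and the reading of "orthogonal fusion" as projection removal.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    # Single-head scaled dot-product attention with no learned weights
    # (a simplification; a real module would use learned projections).
    d = queries.shape[-1]
    weights = softmax(queries @ context.T / np.sqrt(d))
    return weights @ context

def orthogonal_component(x, ref, eps=1e-8):
    # Remove from each row of x its projection onto the matching row of ref,
    # keeping only the part of x that ref does not already carry.
    coeff = (x * ref).sum(-1, keepdims=True) / ((ref * ref).sum(-1, keepdims=True) + eps)
    return x - coeff * ref

def bidirectional_orthogonal_fusion(f_pts, f_img):
    # Hypothetical helper: f_pts and f_img are (N, d) arrays, assumed row-aligned.
    pts_enh = f_pts + cross_attention(f_pts, f_img)  # points attend to image
    img_enh = f_img + cross_attention(f_img, f_pts)  # image attends to points
    img_novel = orthogonal_component(img_enh, pts_enh)  # drop redundant part
    return np.concatenate([pts_enh, img_novel], axis=-1)

# Usage with random stand-in features.
rng = np.random.default_rng(0)
fused = bidirectional_orthogonal_fusion(rng.normal(size=(5, 4)),
                                        rng.normal(size=(5, 4)))
print(fused.shape)  # (5, 8)
```

In this sketch the second half of each fused row is orthogonal to the first half, which is one simple way to read "reducing feature redundancy"; the paper itself should be consulted for the actual formulation.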

Published

2025-04-11

How to Cite

Li, C., Zhang, P., Liu, B., Wei, H., & Wu, Y. (2025). FEAST-Mamba: FEAture and SpaTial Aware Mamba Network with Bidirectional Orthogonal Fusion for Cross-Modal Point Cloud Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(5), 4634–4642. https://doi.org/10.1609/aaai.v39i5.32489

Section

AAAI Technical Track on Computer Vision IV