MSV-PCT: Multi-Sparse-View Enhanced Transformer Framework for Salient Object Detection in Point Clouds

Authors

  • Zihao Wang, Beijing University of Technology
  • Yiming Huang, Beijing University of Technology
  • Gengyu Lyu, Beijing University of Technology
  • Yucheng Zhao, Beijing University of Technology
  • Ziyu Zhou, Beijing University of Technology
  • Bochen Xie, City University of Hong Kong
  • Zhen Yang, Beijing University of Technology
  • Yongjian Deng, Beijing University of Technology

DOI:

https://doi.org/10.1609/aaai.v39i8.32892

Abstract

Salient object detection (SOD) methods for 2D images are of great significance in the field of human-computer interaction (HCI). However, SOD research on 3D point clouds, a common data format in HCI, remains limited. Previous works commonly treat this task as point cloud segmentation, which perceives all points in the scene for prediction. These methods neglect that SOD is designed to simulate human visual perception, in which humans see only visible surfaces and not occluded points. Consequently, they may fail in scenes that contain occlusion. This paper aims to solve this problem by approximately simulating the way humans perceive 3D scenes. To this end, we propose MSV-PCT, a framework based on a 3D visual point cloud backbone and its multi-view projections. Specifically, instead of relying solely on general point cloud learning frameworks, we additionally introduce multi-sparse-view learning branches to supplement SOD perception. Furthermore, we propose a novel point cloud edge detection loss function that addresses artifacts and enables accurate segmentation of salient object edges from the background. Finally, to evaluate the generalization of point cloud SOD methods, we introduce a new approach for generating simulated PC-SOD datasets from RGBD-SOD data. Experiments on the simulated datasets show that MSV-PCT achieves better accuracy and robustness.
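
The abstract does not describe how the simulated PC-SOD datasets are generated. As a rough illustration only, one common way to turn an RGB-D sample with a saliency mask into a labeled point cloud is pinhole back-projection. The sketch below assumes known camera intrinsics (`fx`, `fy`, `cx`, `cy`) and a hypothetical function name `backproject_rgbd`; it is not the authors' procedure.

```python
from typing import List, Sequence, Tuple

# One labeled point: (x, y, z, saliency_label). Illustrative type, not from the paper.
Point = Tuple[float, float, float, int]


def backproject_rgbd(
    depth: Sequence[Sequence[float]],
    mask: Sequence[Sequence[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Lift a depth map and a binary saliency mask into a labeled point cloud.

    Uses the standard pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as missing and skipped.
    """
    points: List[Point] = []
    for v, (depth_row, mask_row) in enumerate(zip(depth, mask)):
        for u, (z, label) in enumerate(zip(depth_row, mask_row)):
            if z <= 0:
                continue  # no valid depth measurement at this pixel
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, float(z), int(label)))
    return points


if __name__ == "__main__":
    # Tiny 2x2 example with made-up values; intrinsics are assumptions.
    depth = [[1.0, 0.0], [2.0, 2.0]]
    mask = [[1, 0], [0, 1]]
    for p in backproject_rgbd(depth, mask, fx=1.0, fy=1.0, cx=0.0, cy=0.0):
        print(p)
```

Because every back-projected point comes from a single camera view, a cloud built this way contains only surfaces visible from that view, which matches the abstract's observation that humans perceive surfaces rather than occluded points.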

Published

2025-04-11

How to Cite

Wang, Z., Huang, Y., Lyu, G., Zhao, Y., Zhou, Z., Xie, B., … Deng, Y. (2025). MSV-PCT: Multi-Sparse-View Enhanced Transformer Framework for Salient Object Detection in Point Clouds. Proceedings of the AAAI Conference on Artificial Intelligence, 39(8), 8268–8276. https://doi.org/10.1609/aaai.v39i8.32892

Section

AAAI Technical Track on Computer Vision VII