GridFormer: Point-Grid Transformer for Surface Reconstruction

Authors

  • Shengtao Li, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, China; School of Software, Tsinghua University, Beijing, China
  • Ge Gao, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, China; School of Software, Tsinghua University, Beijing, China
  • Yudong Liu, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, China; School of Software, Tsinghua University, Beijing, China
  • Yu-Shen Liu, School of Software, Tsinghua University, Beijing, China
  • Ming Gu, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, China; School of Software, Tsinghua University, Beijing, China

DOI:

https://doi.org/10.1609/aaai.v38i4.28100

Keywords:

CV: 3D Computer Vision

Abstract

Implicit neural networks have emerged as a crucial technology in 3D surface reconstruction. To reconstruct continuous surfaces from discrete point clouds, existing approaches commonly encode the input points into regular grid features (planes or volumes). However, these methods typically use the grid only as an index for uniformly scattering point features. Compared with irregular point features, regular grid features may sacrifice reconstruction detail in exchange for efficiency. To take full advantage of both types of features, we introduce a novel, highly efficient attention mechanism between grid and point features, named the Point-Grid Transformer (GridFormer). This mechanism treats the grid as a transfer point connecting the space and the point cloud, maximizing the spatial expressiveness of grid features while maintaining computational efficiency. Furthermore, optimizing predictions over the entire space can result in blurred boundaries. To address this issue, we propose a boundary optimization strategy that combines a margin binary cross-entropy loss with boundary sampling, enabling a more precise representation of the object structure. Our experiments validate that our method is effective and outperforms state-of-the-art approaches on widely used benchmarks by producing more precise geometry reconstructions. The code is available at https://github.com/list17/GridFormer.
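The abstract does not give the exact form of the margin binary cross-entropy loss, but one plausible reading is a BCE variant whose hard 0/1 occupancy labels are softened by a margin, so that query points near the surface boundary are not pushed to extreme confidences. The sketch below illustrates that idea in PyTorch; the function name `margin_bce_loss` and the label-softening formulation are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def margin_bce_loss(logits: torch.Tensor,
                    targets: torch.Tensor,
                    margin: float = 0.1) -> torch.Tensor:
    """Hypothetical margin BCE for occupancy prediction.

    Shifts hard 0/1 occupancy labels toward 0.5 by `margin`,
    discouraging over-confident predictions near the surface
    boundary (a guess at the paper's loss, not its actual code).
    """
    # 1 -> 1 - margin, 0 -> margin
    soft_targets = targets * (1.0 - margin) + (1.0 - targets) * margin
    return F.binary_cross_entropy_with_logits(logits, soft_targets)

# Usage: occupancy logits for sampled query points vs. ground truth.
logits = torch.tensor([2.0, -2.0, 0.5])
targets = torch.tensor([1.0, 0.0, 1.0])
loss = margin_bce_loss(logits, targets, margin=0.1)
```

With `margin=0.0` this reduces to the standard BCE-with-logits loss; a positive margin penalizes saturated predictions, which in combination with sampling extra queries near the surface could sharpen the reconstructed boundary.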

Published

2024-03-24

How to Cite

Li, S., Gao, G., Liu, Y., Liu, Y.-S., & Gu, M. (2024). GridFormer: Point-Grid Transformer for Surface Reconstruction. Proceedings of the AAAI Conference on Artificial Intelligence, 38(4), 3163-3171. https://doi.org/10.1609/aaai.v38i4.28100

Issue

Section

AAAI Technical Track on Computer Vision III