IGFuse: Interactive 3D Gaussian Scene Reconstruction via Multi-Scans Fusion

Authors

  • Wenhao Hu (Zhejiang University; Horizon Robotics)
  • Zesheng Li (Nanyang Technological University)
  • Haonan Zhou (Zhejiang University)
  • Liu Liu (Horizon Robotics)
  • Xuexiang Wen (Zhejiang University)
  • Zhizhong Su (Horizon Robotics)
  • Xi Li (Zhejiang University)
  • Gaoang Wang (Zhejiang University)

DOI:

https://doi.org/10.1609/aaai.v40i6.42497

Abstract

Reconstructing complete and interactive 3D scenes remains a fundamental challenge in computer vision and robotics, particularly due to persistent object occlusions and limited sensor coverage. Even multi-view observations from a single scene scan often fail to capture the full structural details. Existing approaches typically rely on multi-stage pipelines (such as segmentation, background completion, and inpainting) or require per-object dense scanning, both of which are error-prone and not easily scalable. We propose IGFuse, a novel framework that reconstructs an interactive Gaussian scene by fusing observations from multiple scans, where natural object rearrangement between captures reveals previously occluded regions. Our method constructs segmentation-aware Gaussian fields and enforces bi-directional photometric and semantic consistency across scans. To handle spatial misalignments, we introduce a pseudo-intermediate scene state for symmetric alignment, alongside collaborative co-pruning strategies to refine geometry. IGFuse enables high-fidelity rendering and object-level scene manipulation without dense observations or complex pipelines. Extensive experiments validate the framework’s strong generalization to novel scene configurations, demonstrating its effectiveness for real-world 3D reconstruction and real-to-simulation transfer.

Published

2026-03-14

How to Cite

Hu, W., Li, Z., Zhou, H., Liu, L., Wen, X., Su, Z., … Wang, G. (2026). IGFuse: Interactive 3D Gaussian Scene Reconstruction via Multi-Scans Fusion. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4932–4940. https://doi.org/10.1609/aaai.v40i6.42497

Issue

Section

AAAI Technical Track on Computer Vision III