PHFormer: Multi-Fragment Assembly Using Proxy-Level Hybrid Transformer

Authors

  • Wenting Cui Xi'an Jiaotong University
  • Runzhao Yao Xi'an Jiaotong University
  • Shaoyi Du Xi'an Jiaotong University

DOI:

https://doi.org/10.1609/aaai.v38i2.27905

Keywords:

CV: 3D Computer Vision

Abstract

Fragment assembly involves restoring broken objects to their original geometries and has many applications, such as archaeological restoration. Existing learning-based frameworks have shown potential for solving part assembly problems with semantic decomposition, but cannot handle such geometrical decomposition problems. In this work, we propose a novel assembly framework, the proxy-level hybrid Transformer, whose core idea is to use a hybrid graph to model and reason about the complex structural relationships between patches of fragments, dubbed proxies. To this end, we propose a hybrid attention module, composed of intra- and inter-attention layers, that captures crucial contextual information within fragments and relative structural knowledge across fragments. Furthermore, we propose an adjacency-aware hierarchical pose estimator that exploits a decompose-and-integrate strategy. It progressively predicts adjacency probabilities and relative poses between fragments, and then implicitly infers their absolute poses through dynamic information integration. Extensive experimental results demonstrate that our method effectively reduces assembly errors while maintaining fast inference speed. The code is available at https://github.com/521piglet/PHFormer.
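
The intra/inter split described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a minimal pure-Python illustration that assumes each proxy is a plain feature vector tagged with the index of the fragment it belongs to. The function names `masked_attention` and `hybrid_block`, the absence of learned projections, and the residual connection are all assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_attention(features, fragment_ids, mode):
    """Scaled dot-product attention restricted by fragment membership.

    Hypothetical helper, not from the paper's code base.
    mode="intra": each proxy attends only to proxies of its own fragment.
    mode="inter": each proxy attends only to proxies of other fragments.
    No learned query/key/value projections are used (a simplification).
    """
    dim = len(features[0])
    out = []
    for i, query in enumerate(features):
        if mode == "intra":
            keys = [j for j, f in enumerate(fragment_ids) if f == fragment_ids[i]]
        else:
            keys = [j for j, f in enumerate(fragment_ids) if f != fragment_ids[i]]
        if not keys:
            # No other fragment to attend to: pass the feature through unchanged.
            out.append(list(query))
            continue
        scores = [
            sum(a * b for a, b in zip(query, features[j])) / math.sqrt(dim)
            for j in keys
        ]
        weights = softmax(scores)
        attended = [
            sum(w * features[j][d] for w, j in zip(weights, keys))
            for d in range(dim)
        ]
        # Residual connection (an assumption of this sketch).
        out.append([q + a for q, a in zip(query, attended)])
    return out


def hybrid_block(features, fragment_ids):
    """One intra-attention pass followed by one inter-attention pass."""
    intra = masked_attention(features, fragment_ids, "intra")
    return masked_attention(intra, fragment_ids, "inter")


# Three proxies: two from fragment 0, one from fragment 1.
proxies = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fragments = [0, 0, 1]
updated = hybrid_block(proxies, fragments)
```

In this toy setup the intra pass mixes information only among proxies of the same fragment, while the inter pass lets each proxy look exclusively at the other fragments, which mirrors the within-fragment versus across-fragment roles the abstract assigns to the two layer types.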

Published

2024-03-24

How to Cite

Cui, W., Yao, R., & Du, S. (2024). PHFormer: Multi-Fragment Assembly Using Proxy-Level Hybrid Transformer. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 1408-1416. https://doi.org/10.1609/aaai.v38i2.27905

Issue

Vol. 38 No. 2 (2024)

Section

AAAI Technical Track on Computer Vision I