Universal Features Guided Zero-Shot Category-Level Object Pose Estimation

Authors

  • Wentian Qu (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Hong Kong University of Science and Technology)
  • Chenyu Meng (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences)
  • Heng Li (Hong Kong University of Science and Technology)
  • Jian Cheng (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences)
  • Cuixia Ma (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences)
  • Hongan Wang (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences)
  • Xiao Zhou (Aerospace Information Research Institute, Chinese Academy of Sciences)
  • Xiaoming Deng (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences)
  • Ping Tan (Hong Kong University of Science and Technology)

DOI:

https://doi.org/10.1609/aaai.v39i6.32713

Abstract

Object pose estimation, crucial in computer vision and robotics applications, is challenged by the diversity of unseen categories. We propose a zero-shot method for category-level 6-DOF object pose estimation that exploits both 2D and 3D universal features of the input RGB-D image to establish semantic similarity-based correspondences, and that extends to unseen categories without additional model fine-tuning. Our method begins by combining efficient 2D universal features to find sparse correspondences between intra-category objects and obtain an initial coarse pose. Because the correspondences from 2D universal features degrade when the pose deviates substantially from the target pose, we use an iterative strategy to optimize the pose. Subsequently, to resolve pose ambiguities caused by shape differences between intra-category objects, the coarse pose is refined by optimizing with a dense alignment constraint on 3D universal features. Our method outperforms previous methods on the REAL275 and Wild6D benchmarks for unseen categories.
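
To illustrate what "semantic similarity-based correspondences" can mean in practice, the following is a minimal sketch of one common matching scheme: mutual nearest neighbors under cosine similarity between per-point feature vectors of a reference object and a target object. This is not taken from the paper; the function names (`cosine_similarity`, `mutual_nearest_neighbors`), the `min_sim` threshold, and the choice of mutual nearest-neighbor filtering are all assumptions made for illustration, and the features here are plain lists of floats rather than outputs of any particular model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def mutual_nearest_neighbors(ref_feats, tgt_feats, min_sim=0.0):
    """Return (ref_idx, tgt_idx, similarity) for mutual best matches.

    Hypothetical helper: a pair is kept only if each point is the other's
    most similar point and the similarity is at least `min_sim`.
    """
    if not ref_feats or not tgt_feats:
        return []
    # Full similarity matrix: rows index reference points, columns target points.
    sim = [[cosine_similarity(r, t) for t in tgt_feats] for r in ref_feats]
    # Best target for every reference point, and best reference for every target.
    best_tgt = [max(range(len(tgt_feats)), key=lambda j: row[j]) for row in sim]
    best_ref = [
        max(range(len(ref_feats)), key=lambda i: sim[i][j])
        for j in range(len(tgt_feats))
    ]
    matches = []
    for i, j in enumerate(best_tgt):
        if best_ref[j] == i and sim[i][j] >= min_sim:
            matches.append((i, j, sim[i][j]))
    return matches
```

In a pipeline of the kind the abstract outlines, sparse matches like these would typically be passed to a rigid pose solver to obtain a coarse pose; that step, and the iterative and dense refinement stages, are outside the scope of this sketch.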

Published

2025-04-11

How to Cite

Qu, W., Meng, C., Li, H., Cheng, J., Ma, C., Wang, H., … Tan, P. (2025). Universal Features Guided Zero-Shot Category-Level Object Pose Estimation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(6), 6648–6656. https://doi.org/10.1609/aaai.v39i6.32713

Section

AAAI Technical Track on Computer Vision V