A Scalable Tree-Based Approach for Joint Object and Pose Recognition


  • Kevin Lai, University of Washington
  • Liefeng Bo, University of Washington
  • Xiaofeng Ren, Intel Labs
  • Dieter Fox, University of Washington


Recognizing possibly thousands of objects is a crucial capability for an autonomous agent to understand and interact with everyday environments. Practical object recognition comes in multiple forms: Is this a coffee mug? (category recognition). Is this Alice's coffee mug? (instance recognition). Is the mug's handle facing left or right? (pose recognition). We present a scalable framework, the Object-Pose Tree, which efficiently organizes data into a semantically structured tree. The tree structure enables both scalable training and testing, allowing us to solve recognition over thousands of object poses in near real-time. Moreover, by simultaneously optimizing all three tasks, our approach outperforms standard nearest-neighbor and 1-vs-all classification, with large improvements on pose recognition. We evaluate the proposed technique on a dataset of 300 household objects collected using a Kinect-style 3D camera. Experiments demonstrate that our system achieves robust and efficient object category, instance, and pose recognition on challenging everyday objects.
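
To make the idea of a semantically structured tree concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a three-level category, instance, pose hierarchy in which each node carries its own linear scorer, and that a query is answered by greedily following the best-scoring child at each level. The abstract does not specify the tree's levels, the per-node model, or the training objective, so the `Node` class, the `descend` function, the toy weights, and the labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a hypothetical category -> instance -> pose tree."""
    label: str
    weights: List[float]  # assumed per-node linear scorer (not from the paper)
    children: List["Node"] = field(default_factory=list)

    def score(self, x: List[float]) -> float:
        # Dot product between this node's weights and the feature vector.
        return sum(w * xi for w, xi in zip(self.weights, x))


def descend(root: Node, x: List[float]) -> List[str]:
    """Greedily follow the best-scoring child at each level.

    Returns the labels along the chosen path, so one pass can yield a
    category, an instance, and a pose together.
    """
    path = []
    node = root
    while node.children:
        node = max(node.children, key=lambda child: child.score(x))
        path.append(node.label)
    return path


# Toy tree with made-up 3-D weights, purely for illustration.
TREE = Node("root", [0.0, 0.0, 0.0], [
    Node("mug", [1.0, 0.0, 0.0], [
        Node("alice_mug", [0.0, 1.0, 0.0], [
            Node("handle_left", [0.0, 0.0, -1.0]),
            Node("handle_right", [0.0, 0.0, 1.0]),
        ]),
        Node("bob_mug", [0.0, -1.0, 0.0]),
    ]),
    Node("bowl", [-1.0, 0.0, 0.0]),
])

if __name__ == "__main__":
    print(descend(TREE, [1.0, 0.2, -0.5]))
```

In this sketch a query scores only the children along one root-to-leaf path, whereas a flat 1-vs-all classifier would score every pose; that difference is the kind of saving a tree organization can offer at test time.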




How to Cite

Lai, K., Bo, L., Ren, X., & Fox, D. (2011). A Scalable Tree-Based Approach for Joint Object and Pose Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 25(1), 1474-1480. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/7986