Progress and Limitations of Deep Networks to Recognize Objects in Unusual Poses

Authors

  • Amro Abbas, The African Institute For Mathematical Sciences
  • Stéphane Deny, Aalto University

DOI:

https://doi.org/10.1609/aaai.v37i1.25087

Keywords:

CV: Adversarial Attacks & Robustness, CV: Scene Analysis & Understanding, ML: Deep Neural Architectures

Abstract

Deep networks should be robust to rare events if they are to be successfully deployed in high-stakes real-world applications. Here we study the capability of deep networks to recognize objects in unusual poses. We create a synthetic dataset of images of objects in unusual orientations, and evaluate the robustness of a collection of 38 recent and competitive deep networks for image classification. We show that classifying these images is still a challenge for all networks tested, with an average accuracy drop of 29.5% compared to when the objects are presented upright. This brittleness is largely unaffected by various design choices, such as training losses, architectures, dataset modalities, and data-augmentation schemes. However, networks trained on very large datasets substantially outperform others: the best network tested, Noisy Student trained on JFT-300M, shows a relatively small accuracy drop of only 14.5% on unusual poses. Nevertheless, a visual inspection of the failures of Noisy Student reveals a remaining robustness gap relative to humans. Moreover, combining multiple object transformations (3D rotations and scaling) further degrades the performance of all networks. Our results provide another measurement of the robustness of deep networks to consider when using them in the real world. Code and datasets are available at https://github.com/amro-kamal/ObjectPose.
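
The headline numbers in the abstract are accuracy drops between upright and unusual poses, averaged over networks. The sketch below illustrates one way such a figure could be computed. It assumes the drop is an absolute difference in percentage points, which the abstract does not specify. The function names, network names, and accuracy values are hypothetical placeholders, not taken from the paper or its repository.

```python
from statistics import mean


def accuracy(predictions, labels):
    """Return the fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def accuracy_drop(upright_acc, unusual_acc):
    """Absolute drop in percentage points (an assumption about the metric)."""
    return 100.0 * (upright_acc - unusual_acc)


def average_drop(results):
    """Average the per-network drop over a {name: (upright_acc, unusual_acc)} mapping."""
    return mean(accuracy_drop(up, un) for up, un in results.values())


# Hypothetical toy values for illustration only; these are NOT results from the paper.
toy_results = {
    "net_a": (0.90, 0.60),
    "net_b": (0.80, 0.60),
}
toy_average = average_drop(toy_results)
```

If the relative drop were intended instead, `accuracy_drop` would divide the difference by `upright_acc`; the two conventions can give noticeably different figures, so it is worth checking which one a reported number uses.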

Published

2023-06-26

How to Cite

Abbas, A., & Deny, S. (2023). Progress and Limitations of Deep Networks to Recognize Objects in Unusual Poses. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 160-168. https://doi.org/10.1609/aaai.v37i1.25087

Section

AAAI Technical Track on Computer Vision I