Are Elephants Bigger than Butterflies? Reasoning about Sizes of Objects

Authors

  • Hessam Bagherinezhad, University of Washington
  • Hannaneh Hajishirzi, University of Washington
  • Yejin Choi, University of Washington
  • Ali Farhadi, University of Washington

DOI:

https://doi.org/10.1609/aaai.v30i1.10476

Keywords:

language-vision, knowledge extraction, size information

Abstract

Human vision greatly benefits from information about the sizes of objects. The role of size in several visual reasoning tasks has been thoroughly explored in human perception and cognition. However, the impact of size information is yet to be determined in AI. We postulate that this is mainly attributable to the lack of a comprehensive repository of size information. In this paper, we introduce a method to automatically infer object sizes, leveraging visual and textual information from the web. By maximizing the joint likelihood of textual and visual observations, our method learns reliable relative size estimates with no explicit human supervision. We introduce the relative size dataset and show that our method outperforms competitive textual and visual baselines in reasoning about size comparisons.
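
As a rough illustration of what maximizing a joint likelihood over textual and visual observations can look like, the sketch below fits one log size per object by gradient ascent. It is not the authors' model: the Gaussian noise on log sizes, the function name `fit_log_sizes`, the parameters `sigma_visual` and `sigma_text`, and all numbers in the usage example are assumptions made only for illustration. Visual observations are treated as pairwise log ratios between two objects, and textual observations as direct log-size mentions.

```python
import math


def fit_log_sizes(n_objects, visual_pairs, textual_sizes,
                  sigma_visual=1.0, sigma_text=1.0,
                  step=0.1, iterations=5000):
    """Gradient ascent on a Gaussian joint log-likelihood over log sizes.

    visual_pairs:  (i, j, log_ratio) tuples, where log(size_i / size_j)
                   was observed as log_ratio (hypothetical visual evidence).
    textual_sizes: (i, log_size) tuples, where log(size_i) was observed
                   as log_size (hypothetical textual evidence).
    """
    log_sizes = [0.0] * n_objects
    for _ in range(iterations):
        grad = [0.0] * n_objects
        # Visual term: each pair pulls the difference s_i - s_j toward log_ratio.
        for i, j, log_ratio in visual_pairs:
            residual = (log_ratio - (log_sizes[i] - log_sizes[j])) / sigma_visual ** 2
            grad[i] += residual
            grad[j] -= residual
        # Textual term: each mention pulls s_i toward the stated log size.
        for i, log_size in textual_sizes:
            grad[i] += (log_size - log_sizes[i]) / sigma_text ** 2
        log_sizes = [s + step * g for s, g in zip(log_sizes, grad)]
    return log_sizes


# Made-up observations, not taken from the paper or its dataset.
names = ["butterfly", "dog", "elephant"]
visual_pairs = [
    (1, 0, math.log(10.0)),  # dog appears about 10x larger than butterfly
    (2, 1, math.log(5.0)),   # elephant appears about 5x larger than dog
]
textual_sizes = [(1, math.log(0.5))]  # text mentions a dog of about 0.5 m

log_sizes = fit_log_sizes(len(names), visual_pairs, textual_sizes)
sizes = {name: math.exp(s) for name, s in zip(names, log_sizes)}
elephant_is_bigger = sizes["elephant"] > sizes["butterfly"]
```

Under these assumptions a single textual anchor fixes the overall scale, while the visual pairs propagate it to objects that have no direct size mention. The fixed step size suits this small example and would need tuning for larger or noisier observation sets.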

Published

2016-03-05

How to Cite

Bagherinezhad, H., Hajishirzi, H., Choi, Y., & Farhadi, A. (2016). Are Elephants Bigger than Butterflies? Reasoning about Sizes of Objects. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). https://doi.org/10.1609/aaai.v30i1.10476