DISCO: Describing Images Using Scene Contexts and Objects

Ifeoma Nwogu; Yingbo Zhou; Christopher Brown

doi:10.1609/aaai.v25i1.7978

Authors

Ifeoma Nwogu, University of Rochester
Yingbo Zhou, University at Buffalo, State University of New York
Christopher Brown, University of Rochester

DOI:

https://doi.org/10.1609/aaai.v25i1.7978

Abstract

In this paper, we propose a bottom-up approach to generating short descriptive sentences from images, to enhance scene understanding. We demonstrate automatic methods for mapping the visual content in an image to natural spoken or written language. We also introduce a human-in-the-loop evaluation strategy that quantitatively captures the meaningfulness of the generated sentences. We recorded a correctness rate of 60.34% when human users were asked to judge the meaningfulness of the sentences generated from relatively challenging images. Also, our automatic methods compared well with the state-of-the-art techniques for the related computer vision tasks.

Published

2011-08-04

How to Cite

Nwogu, I., Zhou, Y., & Brown, C. (2011). DISCO: Describing Images Using Scene Contexts and Objects. Proceedings of the AAAI Conference on Artificial Intelligence, 25(1), 1487-1493. https://doi.org/10.1609/aaai.v25i1.7978

Issue

Vol. 25 No. 1 (2011): Twenty-Fifth AAAI Conference on Artificial Intelligence

Section

Physically Grounded AI Special Track
