Cross Media Entity Extraction and Linkage for Chemical Documents

Su Yan; Scott Spangler; Ying Chen

doi:10.1609/aaai.v25i1.7832

Authors

Su Yan IBM Almaden Research Lab
Scott Spangler IBM Almaden Research Lab
Ying Chen IBM Almaden Research Lab

DOI:

https://doi.org/10.1609/aaai.v25i1.7832

Abstract

Text and images are two major sources of information in scientific literature. Information from these two media typically reinforces and complements each other, which makes it easier for humans to extract and comprehend. Machines, however, cannot by themselves link images to text or understand the semantic relationship between them. We propose integrating text analysis and image processing techniques to bridge the gap between the two media and to discover knowledge from the combined information sources that traditional single-medium mining systems would otherwise lose. We focus on the chemical entity extraction task because images are well known to add value to the textual content of chemical literature. Annotation of US chemical patent documents demonstrates the effectiveness of our approach.

Published

2011-08-04

How to Cite

Yan, S., Spangler, S., & Chen, Y. (2011). Cross Media Entity Extraction and Linkage for Chemical Documents. Proceedings of the AAAI Conference on Artificial Intelligence, 25(1), 1455–1460. https://doi.org/10.1609/aaai.v25i1.7832

Issue

Vol. 25 No. 1 (2011): Twenty-Fifth AAAI Conference on Artificial Intelligence

Section

Special Track on Integrated Intelligence