Information Extraction as Link Prediction: Using Curated Citation Networks to Improve Gene Detection

Authors

  • Andrew Arnold, Carnegie Mellon University
  • William Cohen, Carnegie Mellon University

Keywords:

Information extraction, citation networks, link prediction

Abstract

In this paper we explore the usefulness of various types of publication-related metadata, such as citation networks and curated databases, for the task of identifying genes in academic biomedical publications. Specifically, we examine whether knowing which genes an author has previously written about, combined with information about previous coauthors and citations, can help us predict which new genes the author is likely to write about in the future. Framed in this way, the problem becomes one of predicting links between authors and genes in the publication network. We show that this purely social-network-based link prediction technique outperforms various baselines, including those relying only on non-social biological information.
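
To make the link-prediction framing concrete, the following is a minimal sketch of one simple way to score candidate author-gene links. It is not the method used in the paper: it swaps in a plain neighbor-count heuristic, and the author names, gene assignments, links, and the function name rank_candidate_genes are all hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy data: genes each author has previously written about.
author_genes = {
    "alice": {"BRCA1", "TP53"},
    "bob": {"TP53", "EGFR"},
    "carol": {"EGFR", "KRAS"},
}

# Hypothetical social links (coauthorship or citation) between authors.
author_links = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice"},
}


def rank_candidate_genes(author, author_genes, author_links):
    """Rank genes that are new to `author` by how many linked authors wrote about them.

    This is an illustrative neighbor-count heuristic, not the paper's technique.
    """
    known = author_genes.get(author, set())
    counts = Counter()
    for neighbor in author_links.get(author, ()):
        # Count only genes the author has not already written about.
        counts.update(g for g in author_genes.get(neighbor, ()) if g not in known)
    # Sort by descending count, then alphabetically for a deterministic order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = rank_candidate_genes("alice", author_genes, author_links)
```

Under these assumptions, a gene mentioned by more of an author's coauthors or cited authors receives a higher score, which captures the intuition that social context alone can suggest which author-gene links are likely to appear next.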

Published

2009-03-20

How to Cite

Arnold, A., & Cohen, W. (2009). Information Extraction as Link Prediction: Using Curated Citation Networks to Improve Gene Detection. Proceedings of the International AAAI Conference on Web and Social Media, 3(1), 175-178. Retrieved from https://ojs.aaai.org/index.php/ICWSM/article/view/13967