Not Quite the Same: Identity Constraints for the Web of Linked Data

Authors

  • Gerard de Melo, ICSI Berkeley

DOI:

https://doi.org/10.1609/aaai.v27i1.8468

Keywords:

Linked Data, Identity

Abstract

Linked Data is based on the idea that information from different sources can flexibly be connected to enable novel applications that individual datasets do not support on their own. This hinges upon the existence of links between datasets that would otherwise be isolated. The most notable of these, sameAs links, are intended to express that two identifiers are equivalent in all respects. Unfortunately, many existing sameAs links do not reflect such genuine identity. This study provides a novel method to analyse this phenomenon, based on a thorough theoretical analysis, as well as a novel graph-based method to resolve such issues to some extent. Our experiments on a representative Web-scale set of sameAs links from the Web of Data show that our method can identify and remove hundreds of thousands of constraint violations.

Published

2013-06-29

How to Cite

de Melo, G. (2013). Not Quite the Same: Identity Constraints for the Web of Linked Data. Proceedings of the AAAI Conference on Artificial Intelligence, 27(1), 1092-1098. https://doi.org/10.1609/aaai.v27i1.8468