Network Sampling Designs for Relational Classification

Nesreen Ahmed; Jennifer Neville; Ramana Kompella

doi:10.1609/icwsm.v6i1.14331

Authors

Nesreen Ahmed, Purdue University
Jennifer Neville, Purdue University
Ramana Kompella, Purdue University

DOI:

https://doi.org/10.1609/icwsm.v6i1.14331

Keywords:

Network Sampling, Relational Classification

Abstract

Relational classification has been extensively studied recently due to its applications in social, biological, technological, and information networks. Much of the work in relational learning has focused on analyzing input data that comprise a single network. Although machine learning researchers have considered how to sample training and test sets from the input network (for evaluation), the mechanisms used to construct the input networks themselves have largely been ignored. In most cases, the input network has itself been sampled from a larger target network (e.g., Facebook), and often the researcher is unaware of how the input network was constructed or what impact that may have on the evaluation of relational models. Since the goal in evaluating relational classification algorithms is to accurately assess their performance on the larger target network, it is critical to understand what impact the initial sampling method may have on our estimates of classification accuracy. In this paper, we present different sampling methods and systematically study their impact on the evaluation of relational classification. Our results indicate that the choice of sampling method can affect classification performance and, consequently, the accuracy of evaluation.
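As a rough illustration of the setup described in the abstract (an input network drawn from a larger target network), the sketch below samples a toy labeled graph in two ways and compares a simple label statistic. It is not the authors' code: the toy generator, the two samplers (uniform node sampling and uniform edge sampling), the edge-homophily statistic, and all function names and parameter values are assumptions chosen only for illustration, and the abstract does not state which sampling methods the paper evaluates.

```python
import random


def make_toy_network(n, rng, p_same=0.10, p_diff=0.02):
    """Hypothetical target network: two classes, denser links within a class."""
    labels = {i: i % 2 for i in range(n)}
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_same if labels[u] == labels[v] else p_diff
            if rng.random() < p:
                edges.append((u, v))
    return labels, edges


def node_sample(edges, nodes, fraction, rng):
    """Uniform node sampling: keep random nodes and the edges among them."""
    k = max(1, round(fraction * len(nodes)))
    kept = set(rng.sample(sorted(nodes), k))
    sub_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, sub_edges


def edge_sample(edges, fraction, rng):
    """Uniform edge sampling: keep random edges and their endpoints."""
    if not edges:
        return set(), []
    k = max(1, round(fraction * len(edges)))
    sub_edges = rng.sample(edges, k)
    kept = {node for edge in sub_edges for node in edge}
    return kept, sub_edges


def homophily(edges, labels):
    """Fraction of edges joining same-label nodes (None if there are no edges)."""
    if not edges:
        return None
    return sum(labels[u] == labels[v] for u, v in edges) / len(edges)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    labels, edges = make_toy_network(200, rng)
    print("target:", len(labels), "nodes,", len(edges), "edges,",
          "homophily", homophily(edges, labels))

    samples = {
        "node sampling": node_sample(edges, labels.keys(), 0.3, rng),
        "edge sampling": edge_sample(edges, 0.3, rng),
    }
    for name, (kept, sub_edges) in samples.items():
        print(name + ":", len(kept), "nodes,", len(sub_edges), "edges,",
              "homophily", homophily(sub_edges, labels))
```

The two samplers can return subgraphs with quite different sizes and densities from the same target network, which is the kind of difference that could carry over into the accuracy a relational classifier appears to achieve on the sampled input.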
