Estimating the Number of Induced Subgraphs from Incomplete Data and Neighborhood Queries

Authors

  • Dimitris Fotakis National Technical University of Athens
  • Thanasis Pittas University of Wisconsin-Madison
  • Stratis Skoulakis Singapore University of Technology and Design

Keywords:

Graph Mining, Social Network Analysis & Community

Abstract

We consider a natural setting where network parameters are estimated from noisy and incomplete information about the network. More specifically, we investigate how we can efficiently estimate the number of small subgraphs (e.g., edges, triangles, etc.) based on full access to one or two noisy and incomplete samples of a large underlying network and on few queries revealing the neighborhood of carefully selected vertices. After specifying a random generator which removes edges from the underlying graph, we present estimators with strong provable performance guarantees, which exploit information from the noisy network samples and query a constant number of the most important vertices for the estimation. Our experimental evaluation shows that, in practice, a single noisy network sample and a couple of hundreds neighborhood queries suffice for accurately estimating the number of triangles in networks with millions of vertices and edges.

Downloads

Published

2021-05-18

How to Cite

Fotakis, D., Pittas, T., & Skoulakis, S. (2021). Estimating the Number of Induced Subgraphs from Incomplete Data and Neighborhood Queries. Proceedings of the AAAI Conference on Artificial Intelligence, 35(5), 4045-4053. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/16525

Issue

Section

AAAI Technical Track on Data Mining and Knowledge Management