Estimating the Number of Induced Subgraphs from Incomplete Data and Neighborhood Queries
Keywords:Graph Mining, Social Network Analysis & Community
AbstractWe consider a natural setting where network parameters are estimated from noisy and incomplete information about the network. More specifically, we investigate how we can efficiently estimate the number of small subgraphs (e.g., edges, triangles, etc.) based on full access to one or two noisy and incomplete samples of a large underlying network and on few queries revealing the neighborhood of carefully selected vertices. After specifying a random generator which removes edges from the underlying graph, we present estimators with strong provable performance guarantees, which exploit information from the noisy network samples and query a constant number of the most important vertices for the estimation. Our experimental evaluation shows that, in practice, a single noisy network sample and a couple of hundreds neighborhood queries suffice for accurately estimating the number of triangles in networks with millions of vertices and edges.
How to Cite
Fotakis, D., Pittas, T., & Skoulakis, S. (2021). Estimating the Number of Induced Subgraphs from Incomplete Data and Neighborhood Queries. Proceedings of the AAAI Conference on Artificial Intelligence, 35(5), 4045-4053. https://doi.org/10.1609/aaai.v35i5.16525
AAAI Technical Track on Data Mining and Knowledge Management