CENI: A Hybrid Framework for Efficiently Inferring Information Networks
Keywords:network inference, information network, clustering, survival models
Information links among users or websites drive the development of countless innovative applications. However, in reality, it is easier for us to observe the timestamps when different nodes in the network react on a message, while the connections empowering the information diffusion remain hidden. This motivates recent extensive studies on the network inference problem: how to uncover the edges from the records of messages disseminated through them. Many existing solutions are computationally expensive, which motivates us to develop an efficient two-step framework, Clustering Embedded Network Inference (CENI). CENI integrates clustering strategies to improve the efficiency of network inference. By clustering nodes on the timelines of messages, we propose two naive implementations of CENI: Infection-centric CENI and Cascade-centric CENI. We further point out the ritical dimension problem of CENI: in order to preserve the node structure hidden in the cascades, we need to first project the nodes into an Euclidean space of certain dimension before clustering.By addressing this problem, we propose the third implementation of the CENI framework: Projection-based CENI. Experiments on two real datasets show that the three CENI models only need around 20% ~ 50% of the running time of some state-of-the-art methods, while the inferred edges have similar or slightly better F-measure.