
Community recovery in non-binary and temporal stochastic block models

Abstract

This article studies the estimation of community memberships from non-binary pair interactions represented by an $N$-by-$N$ tensor whose values are elements of $\mathcal S$, where $N$ is the number of nodes and $\mathcal S$ is the space of the pairwise interactions between the nodes. As an information-theoretic benchmark, we study data sets generated by a non-binary stochastic block model, and derive fundamental information criteria for the recovery of the community memberships as $N \to \infty$. Examples of applications include weighted networks ($\mathcal S = \mathbb R$), link-labeled networks ($\mathcal S = \{0, 1, \dots, L\}$), multiplex networks ($\mathcal S = \{0,1\}^M$) and temporal networks ($\mathcal S = \{0,1\}^T$). For temporal interactions, we show that (i) even a small increase in $T$ may have a big impact on the recovery of community memberships, and (ii) consistent recovery is possible even for very sparse data (e.g., bounded average degree) when $T$ is large enough. We also present several estimation algorithms, both offline and online, which fully utilise the temporal nature of the observed data. We analyse the accuracy of the proposed estimation algorithms under various assumptions on data sparsity and identifiability. Numerical experiments show that even a poor initial estimate (e.g., a blind random guess) of the community assignment leads to high accuracy obtained by the online algorithm after a small number of iterations, and remarkably so also in very sparse regimes.
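To make the temporal setting $\mathcal S = \{0,1\}^T$ concrete, the following is a minimal sketch of how such data might be sampled. It is not the model or any algorithm from the article: the function name `sample_temporal_sbm`, the parameters `p_in` and `p_out`, and above all the assumption that the $T$ snapshots are independent and identically distributed are illustrative choices made here.

```python
import numpy as np


def sample_temporal_sbm(n, T, p_in, p_out, n_communities=2, seed=0):
    """Sample community labels and an (n, n, T) binary interaction tensor.

    Assumption (not from the article): the T snapshots are i.i.d., with
    edge probability p_in inside a community and p_out across communities.
    """
    rng = np.random.default_rng(seed)

    # Uniformly random community label for each node.
    labels = rng.integers(0, n_communities, size=n)

    # Pairwise edge probabilities, depending only on label agreement.
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)

    # Draw each pair once per snapshot, on the strict upper triangle only.
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)
    draws = rng.random((n, n, T)) < probs[:, :, None]
    draws &= upper[:, :, None]

    # Mirror to get symmetric interactions with no self-loops.
    tensor = draws | draws.transpose(1, 0, 2)
    return labels, tensor.astype(np.uint8)


labels, A = sample_temporal_sbm(n=30, T=5, p_in=0.3, p_out=0.05)
```

Here `A[i, j, :]` is the length-$T$ binary interaction pattern of the pair $(i, j)$, that is, one element of $\{0,1\}^T$.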
