Community recovery in non-binary and temporal stochastic block models

Abstract

This article studies the estimation of latent community memberships from pairwise interactions in a network of $N$ nodes, where the observed interactions can be of arbitrary type, including binary, categorical, and vector-valued, and not excluding even more general objects such as time series or spatial point patterns. As a generative model for such data, we introduce a stochastic block model with a general measurable interaction space $\mathcal{S}$, for which we derive information-theoretic bounds for the minimum achievable error rate. These bounds yield sharp criteria for the existence of consistent and strongly consistent estimators in terms of data sparsity, statistical similarity between intra- and inter-block interaction distributions, and the shape and size of the interaction space. The general framework makes it possible to study temporal and multiplex networks with $\mathcal{S} = \{0,1\}^T$, in settings where both $N \to \infty$ and $T \to \infty$, and the temporal interaction patterns are correlated over time. We present several estimation algorithms, both offline and online, which fully utilise the non-binary nature of the observed data. Numerical experiments show that an online algorithm of low complexity is capable of producing accurate estimates in only a few steps starting from a blind random guess.
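To make the temporal setting with $\mathcal{S} = \{0,1\}^T$ concrete, the following minimal sketch samples one interaction sequence per node pair, with the distribution of the sequence depending only on whether the two nodes share a block. It is an illustration under stated assumptions rather than the model or code of the paper: the function name `sample_temporal_sbm`, the uniform prior on block labels, and the two-state Markov chain parameters `(p_init, p_birth, p_persist)` are all hypothetical choices made here to produce interactions that are correlated over time.

```python
import random


def sample_temporal_sbm(n, t, k, intra, inter, seed=0):
    """Sample block labels and one binary sequence in {0,1}^t per node pair.

    `intra` and `inter` are hypothetical parameter triples
    (p_init, p_birth, p_persist) of a two-state Markov chain:
      p_init    = P(X_1 = 1)
      p_birth   = P(X_{s+1} = 1 | X_s = 0)
      p_persist = P(X_{s+1} = 1 | X_s = 1)
    `intra` is used for pairs in the same block, `inter` otherwise.
    """
    rng = random.Random(seed)
    # Assumption: labels are drawn independently and uniformly from k blocks.
    labels = [rng.randrange(k) for _ in range(n)]
    interactions = {}
    for i in range(n):
        for j in range(i + 1, n):
            same_block = labels[i] == labels[j]
            p_init, p_birth, p_persist = intra if same_block else inter
            x = 1 if rng.random() < p_init else 0
            seq = [x]
            for _ in range(t - 1):
                # The next state depends on the current one, so the
                # interaction pattern of a pair is correlated over time.
                p_one = p_persist if x == 1 else p_birth
                x = 1 if rng.random() < p_one else 0
                seq.append(x)
            interactions[(i, j)] = tuple(seq)
    return labels, interactions


# Example: 8 nodes, 5 time steps, 2 blocks; intra-block pairs interact
# more often and more persistently than inter-block pairs.
labels, interactions = sample_temporal_sbm(
    n=8, t=5, k=2, intra=(0.3, 0.1, 0.8), inter=(0.05, 0.02, 0.3)
)
```

Each value in `interactions` is a single element of $\{0,1\}^T$, so the whole sequence of a pair, and not only its total count of ones, is available to an estimator.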
