Detection in the stochastic block model with multiple clusters: proof of the achievability conjectures, acyclic BP, and the information-computation gap

Abstract

In a paper that initiated the modern study of the stochastic block model, Decelle et al., backed by Mossel et al., made the following conjecture: Denote by $k$ the number of balanced communities, $a/n$ the probability of connecting inside communities and $b/n$ across, and set $\mathrm{SNR}=(a-b)^2/(k(a+(k-1)b))$; for any $k \geq 2$, it is possible to detect communities efficiently whenever $\mathrm{SNR}>1$ (the KS threshold), whereas for $k \geq 4$, it is possible to detect communities information-theoretically for some $\mathrm{SNR}<1$. Massoulié, Mossel et al. and Bordenave et al. succeeded in proving that the KS threshold is efficiently achievable for $k=2$, while Mossel et al. proved that it cannot be crossed information-theoretically for $k=2$. The above conjecture remained open for $k \geq 3$. This paper proves this conjecture, further extending the efficient detection to non-symmetrical SBMs with a generalized notion of detection and KS threshold. For the efficient part, a linearized acyclic belief propagation (ABP) algorithm is developed and proved to detect communities for any $k$ down to the KS threshold in time $O(n \log n)$. Achieving this requires showing optimality of ABP in the presence of cycles, a challenge for message passing algorithms. The paper further connects ABP to a power iteration method with a nonbacktracking operator of generalized order, formalizing the interplay between message passing and spectral methods. For the information-theoretic (IT) part, a non-efficient algorithm sampling a typical clustering is shown to break down the KS threshold at $k=4$. The emerging gap is shown to be large in some cases; if $a=0$, the KS threshold reads $b \gtrsim k^2$ whereas the IT bound reads $b \gtrsim k \ln(k)$, making the SBM a good study-case for information-computation gaps.
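
As a quick check of the $a=0$ case, using only the SNR definition stated in the abstract (a worked substitution, not a result taken from the paper's body):

$$\mathrm{SNR}\big|_{a=0} = \frac{(0-b)^2}{k\,\bigl(0+(k-1)b\bigr)} = \frac{b}{k(k-1)},$$

so $\mathrm{SNR}>1$ is equivalent to $b > k(k-1)$, which is of order $k^2$ and consistent with the stated KS condition $b \gtrsim k^2$.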
