Optimality of Spectral Algorithms for Community Detection in the Labeled Stochastic Block Model

Abstract

We consider the problem of community detection in the labeled Stochastic Block Model (labeled SBM) with a finite number $K$ of communities whose sizes grow linearly with the network size $n$. Every pair of nodes is labeled independently at random, and label $\ell$ appears with probability $p(i,j,\ell)$ between two nodes in communities $i$ and $j$, respectively. One observes a realization of these random labels, and the objective is to reconstruct the communities from this observation. Under mild assumptions on the parameters $p$, we show that under spectral algorithms, the number of misclassified nodes does not exceed $s$ with high probability as $n$ grows large, whenever $\bar{p}n=\omega(1)$ (where $\bar{p}=\max_{i,j,\ell\ge 1}p(i,j,\ell)$), $s=o(n)$ and $\frac{n D(p)}{\log (n/s)} >1$, where $D(p)$, referred to as the divergence, is an appropriately defined function of the parameters $p=(p(i,j,\ell), i,j, \ell)$. We further show that $\frac{n D(p)}{\log (n/s)} >1$ is actually necessary to obtain fewer than $s$ misclassified nodes asymptotically. This establishes the optimality of spectral algorithms, i.e., when $\bar{p}n=\omega(1)$ and $nD(p)=\omega(1)$, no algorithm can perform better in terms of expected misclassified nodes than spectral algorithms.
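
To make the observation model concrete, the following is a minimal Python sketch of how a labeled SBM instance could be sampled. It is an illustration only, not code from the paper: the function name `sample_labeled_sbm`, the array layout `p[i, j, l]`, and the convention that label 0 stands for "no label" (suggested by the maximum over $\ell \ge 1$ in the definition of $\bar{p}$) are all assumptions made here.

```python
import numpy as np


def sample_labeled_sbm(sizes, p, seed=None):
    """Sample one labeled SBM realization (hypothetical helper).

    sizes : list of K community sizes.
    p     : array of shape (K, K, L + 1); p[i, j, l] is the probability
            that a pair with one node in community i and one in community j
            receives label l. Assumed symmetric in (i, j), with each
            p[i, j, :] summing to 1. Label 0 is assumed to mean "no label".
    Returns (communities, labels): the true community of each node and a
    symmetric n x n integer matrix of observed labels (zero diagonal).
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    communities = np.repeat(np.arange(len(sizes)), sizes)
    n = communities.size
    labels = np.zeros((n, n), dtype=int)
    # Each unordered pair of nodes is labeled independently.
    for u in range(n):
        for v in range(u + 1, n):
            probs = p[communities[u], communities[v]]
            label = rng.choice(probs.size, p=probs)
            labels[u, v] = labels[v, u] = label
    return communities, labels


# Example: K = 2 communities, labels {0, 1, 2}.
p_example = np.array([
    [[0.5, 0.3, 0.2], [0.9, 0.05, 0.05]],
    [[0.9, 0.05, 0.05], [0.5, 0.2, 0.3]],
])
communities, labels = sample_labeled_sbm([3, 2], p_example, seed=0)
```

A reconstruction algorithm would see only `labels` and would be scored by the number of nodes whose recovered community differs from `communities`, up to a relabeling of the communities.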
