74
53

Distributed Complexity of Large-Scale Graph Computations

Abstract

Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where k2k \geq 2 machines jointly perform computations on graphs with nn nodes (typically, nkn \gg k). Our main contribution is the \emph{General Lower Bound Theorem}, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations. The General Lower Bound Theorem is established via an information-theoretic approach that relates the round complexity to the minimal amount of information required by machines for solving a problem. Our approach is generic and this theorem can be used in a "cookbook" fashion to show distributed lower bounds in the context of other problems (including non-graph problems). We present two applications by showing (almost) tight bounds for the round complexity of two fundamental graph problems: 1) PageRank: We show a lower bound of Ω~(n/k2)\tilde{\Omega}(n/k^2) rounds, and present a distributed algorithm that computes the PageRank of all the nodes of a graph in O~(n/k2)\tilde{O}(n/k^2) rounds. 2) Triangle enumeration: We show that there exist graphs with mm edges where any distributed algorithm requires Ω~(m/k5/3)\tilde{\Omega}(m/k^{5/3}) rounds. This result implies the first non-trivial lower bound of Ω~(n1/3)\tilde\Omega(n^{1/3}) rounds for the congested clique model. We also present a distributed algorithm that enumerates all the triangles of a graph in O~(m/k5/3+n/k4/3)\tilde{O}(m/k^{5/3} + n/k^{4/3}) rounds.

View on arXiv
Comments on this paper