Sparse Matrix Multiplication in the Low-Bandwidth Model

2 March 2022

Abstract

We study matrix multiplication in the low-bandwidth model: there are $n$ computers, and we need to compute the product of two $n \times n$ matrices. Initially, computer $i$ knows row $i$ of each input matrix. In one communication round, each computer can send and receive one $O(\log n)$-bit message. Eventually, computer $i$ has to output row $i$ of the product matrix. We seek to understand the complexity of this problem in the uniformly sparse case: each row and column of each input matrix has at most $d$ nonzeros, and in the product matrix we only need to know the values of at most $d$ elements in each row or column. This is exactly the setting that arises, e.g., when we apply matrix multiplication to triangle detection in graphs of maximum degree $d$. We focus on the supported setting: the structure of the matrices is known in advance; only the numerical values of the nonzero elements are unknown. There is a trivial algorithm that solves the problem in $O(d^2)$ rounds, but for large $d$, better algorithms are known to exist: in the moderately dense regime the problem can be solved in $O(dn^{1/3})$ communication rounds, and for very large $d$, the dominant solution is the fast matrix multiplication algorithm using $O(n^{1.158})$ communication rounds (for matrix multiplication over fields and rings supporting fast matrix multiplication). In this work we show that it is possible to overcome the quadratic barrier for all values of $d$: we present an algorithm that solves the problem in $O(d^{1.907})$ rounds for fields and rings supporting fast matrix multiplication and $O(d^{1.927})$ rounds for semirings, independent of $n$.
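To make the $O(d^2)$ baseline concrete, the following Python sketch simulates one natural reading of the trivial approach: computer $i$ fetches, for every nonzero in its row of the first matrix, the matching row of the second matrix, and the simulation counts how many values each computer sends and receives. The abstract only states the bound, so the scheme itself is an assumption, as are the function name `naive_sparse_product` and the dictionary-based matrix layout. This is not the algorithm presented in the paper.

```python
from collections import defaultdict

def naive_sparse_product(A, B, wanted):
    """Simulate a naive row-fetching scheme (illustrative assumption).

    A, B:   dict row -> dict col -> value; row i is held by computer i.
    wanted: dict row -> set of product columns computer i must output.
    Returns (X, sent, received), where sent/received count values
    transferred per computer (one value per O(log n)-bit message).
    """
    sent = defaultdict(int)
    received = defaultdict(int)
    X = {}
    for i, row in A.items():
        fetched = {}
        for j in row:
            b_row = B.get(j, {})
            if j != i:
                # Computer j ships every nonzero of its row of B to computer i.
                sent[j] += len(b_row)
                received[i] += len(b_row)
            fetched[j] = b_row
        # Computer i combines the fetched rows locally.
        X[i] = {
            k: sum(a * fetched[j].get(k, 0) for j, a in row.items())
            for k in wanted.get(i, ())
        }
    return X, dict(sent), dict(received)

# Example: n = 4, d = 2, cyclic band structure (values chosen arbitrarily).
n, d = 4, 2
A = {i: {i: 1, (i + 1) % n: 2} for i in range(n)}
B = {i: {i: 1, (i + 1) % n: 2} for i in range(n)}
wanted = {i: {i, (i + 1) % n} for i in range(n)}

X, sent, received = naive_sparse_product(A, B, wanted)
max_load = max(max(sent.values()), max(received.values()))
print(X[0], max_load)
```

Since each row of the first matrix has at most $d$ nonzeros and each fetched row has at most $d$ nonzeros, no computer receives more than $d^2$ values; the column bound gives the same limit for sending. Under the one-message-per-round restriction, this per-computer load is what ties the naive scheme to a quadratic number of rounds.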
