We address the problem of Byzantine collaborative learning: a set of nodes seeks to collectively learn from each other's local data. The data distribution may vary from one node to another. No node is trusted, and nodes can behave arbitrarily, i.e., they can be Byzantine. We prove that collaborative learning is equivalent to a new and weak form of agreement, which we call averaging agreement. In this problem, each node starts with an initial vector, and the nodes seek to approximately agree on a common vector that is close to the average of the honest nodes' initial vectors. More precisely, the "error" must remain within a multiplicative constant (which we call the averaging constant) of the maximum distance between the honest nodes' initial vectors. Essentially, the smaller the averaging constant, the better the learning. We present two asynchronous solutions to averaging agreement, each of which we prove optimal in some respect. The first, based on minimum-diameter averaging, requires $n \geq 6f+1$ (where $n$ is the number of nodes and $f$ an upper bound on the number of Byzantine nodes), but achieves the asymptotically best-possible averaging constant up to a multiplicative factor. The second, based on reliable broadcast and coordinate-wise trimmed mean, achieves optimal Byzantine resilience, i.e., $n \geq 3f+1$. Each of these algorithms induces an optimal collaborative learning protocol.
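For illustration only, here is a minimal Python sketch (not taken from the paper) of the two aggregation rules named in the abstract, minimum-diameter averaging and coordinate-wise trimmed mean, under the simplifying assumption that all $n$ candidate vectors have already been collected locally. The full protocols additionally rely on asynchronous primitives such as reliable broadcast, which are omitted here.

```python
import itertools
import numpy as np

def coordinate_wise_trimmed_mean(vectors, f):
    """Per coordinate, drop the f smallest and f largest values and
    average the remaining n - 2f values (sketch, assumes n > 2f)."""
    x = np.sort(np.stack(vectors), axis=0)        # shape (n, d), sorted per coordinate
    return x[f:len(vectors) - f].mean(axis=0)

def minimum_diameter_averaging(vectors, f):
    """Average the subset of n - f vectors with the smallest diameter,
    i.e., the smallest maximum pairwise Euclidean distance
    (brute-force sketch, exponential in n)."""
    n = len(vectors)
    best_subset, best_diameter = None, float("inf")
    for subset in itertools.combinations(range(n), n - f):
        diameter = max(
            np.linalg.norm(vectors[i] - vectors[j])
            for i, j in itertools.combinations(subset, 2)
        )
        if diameter < best_diameter:
            best_subset, best_diameter = subset, diameter
    return np.mean([vectors[i] for i in best_subset], axis=0)
```

Both rules discard the influence of up to $f$ outliers: trimmed mean does so coordinate by coordinate, while minimum-diameter averaging selects a tight cluster of $n - f$ whole vectors before averaging.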