This paper introduces a deterministic Byzantine consensus algorithm that relies on a new weak coordinator. Unlike previous algorithms, which cannot terminate in the presence of a faulty or slow coordinator, our algorithm can terminate even when its coordinator is faulty, hence the name weak coordinator. The key idea is to allow processes to complete asynchronous rounds as soon as they receive a threshold of messages, instead of having to wait for a message from a coordinator that may be slow. The resulting algorithm assumes partial synchrony, is resilience optimal and time optimal, and does not need signatures. Our presentation is didactic: we first present a simple safe binary Byzantine consensus algorithm, then modify it to ensure termination, and finally present an optimized reduction from multivalue consensus to binary consensus that may terminate in 4 message delays. To evaluate our algorithm, we deployed it on 100 machines distributed across 5 datacenters on different continents and compared its performance against the randomized solution of Mostéfaoui, Moumen and Raynal [PODC14], which terminates in O(1) rounds in expectation. Our algorithm always outperforms the latter, even in the presence of Byzantine behaviors. It has a subsecond average latency in most of our geo-distributed experiments, even when attacked by a well-engineered coalition of Byzantine processes.
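
The key idea above, completing a round on a threshold of messages rather than on the coordinator's message, can be illustrated with a minimal sketch. This is not the paper's algorithm: the names `RoundState`, `deliver`, `can_complete` and `can_complete_waiting_for_coordinator` are hypothetical, and the threshold of n - t distinct senders (with n > 3t) is an assumption chosen to match the usual meaning of resilience optimality.

```python
from dataclasses import dataclass, field


@dataclass
class RoundState:
    """Messages collected by one process for one asynchronous round (illustrative only)."""
    n: int  # total number of processes
    t: int  # assumed upper bound on Byzantine processes, with n > 3t
    received: dict = field(default_factory=dict)  # sender id -> binary value

    def deliver(self, sender: int, value: int) -> None:
        # Keep only the first message per sender so that a Byzantine
        # sender cannot be counted twice toward the threshold.
        self.received.setdefault(sender, value)

    def can_complete(self) -> bool:
        # Assumed threshold: n - t distinct senders, because up to t
        # processes may never send anything.
        return len(self.received) >= self.n - self.t


def can_complete_waiting_for_coordinator(state: RoundState, coordinator: int) -> bool:
    # Contrast case: a rule that blocks until the coordinator's message arrives.
    return coordinator in state.received


# Usage: 4 processes, at most 1 Byzantine, and coordinator 0 stays silent.
state = RoundState(n=4, t=1)
for sender in (1, 2, 3):
    state.deliver(sender, 1)

threshold_rule_done = state.can_complete()
coordinator_rule_done = can_complete_waiting_for_coordinator(state, coordinator=0)
```

In this toy run the threshold rule lets the round finish once three distinct senders have been heard from, while the coordinator-based rule is still blocked on the silent coordinator.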