Distributed Frank-Wolfe Algorithm: A Unified Framework for
Communication-Efficient Sparse Learning
In this paper, we tackle the problem of learning sparse combinations of elements distributed across a network. We propose and study a distributed Frank-Wolfe (dFW) algorithm that solves this class of problems in a scalable and communication-efficient way. We obtain strong guarantees on the optimization error and communication cost that do not depend on the total number of combining elements. These guarantees are further supported by the derivation of a lower bound on the communication required to construct an approximate solution, which shows that the communication cost of dFW is in fact optimal. We evaluate the practical performance of dFW on two problems: kernel SVM and LASSO regression. Given the same communication budget, dFW is shown to outperform competing methods such as distributed ADMM.
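To illustrate why Frank-Wolfe is a natural fit for sparse learning, here is a minimal single-machine sketch on a LASSO-style problem (least squares over an l1-ball). This is an illustrative assumption, not the paper's dFW algorithm: each iteration selects a single vertex of the l1-ball, so the iterate after k steps has at most k nonzero coordinates, which is the sparsity property a distributed variant can exploit to keep communication low.

```python
import numpy as np

def frank_wolfe_lasso(A, b, tau, iters=300):
    """Frank-Wolfe on min ||Ax - b||^2 subject to ||x||_1 <= tau.

    Each step adds at most one nonzero coordinate, so the iterate
    stays sparse throughout the optimization.
    """
    n = A.shape[1]
    x = np.zeros(n)
    for k in range(iters):
        grad = 2 * A.T @ (A @ x - b)      # gradient of the quadratic loss
        i = np.argmax(np.abs(grad))       # linear minimization oracle:
        s = np.zeros(n)                   # best vertex of the l1-ball
        s[i] = -tau * np.sign(grad[i])
        gamma = 2.0 / (k + 2)             # standard Frank-Wolfe step size
        x = (1 - gamma) * x + gamma * s   # convex combination keeps x feasible
    return x

# Toy example (synthetic data, for illustration only):
# a 3-sparse ground truth and exact observations.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
x_true = np.zeros(100)
x_true[:3] = [1.0, -2.0, 1.5]
b = A @ x_true
x_hat = frank_wolfe_lasso(A, b, tau=4.5)
```

In a distributed setting, the key observation is that the oracle step only requires each node to report its locally best coordinate (an index and a value), rather than a full gradient vector; this is the kind of structure that makes communication cost independent of the total number of elements.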