Distributed Optimization Based on Gradient Tracking Revisited: Enhancing Convergence Rate via Surrogation

We study distributed multiagent optimization over (directed, time-varying) graphs. We consider the minimization of $F+G$ subject to convex constraints, where $F$ is the smooth strongly convex sum of the agents' losses and $G$ is a nonsmooth convex function. We build on the SONATA algorithm: the algorithm employs surrogate objective functions in the agents' subproblems (going thus beyond linearization, such as proximal-gradient) coupled with a perturbed (push-sum) consensus mechanism that aims to track locally the gradient of $F$. SONATA achieves precision $\epsilon > 0$ on the objective value in $\mathcal{O}(\kappa_g \log(1/\epsilon))$ gradient computations at each node and $\tilde{\mathcal{O}}(\kappa_g (1-\rho)^{-1/2} \log(1/\epsilon))$ communication steps, where $\kappa_g$ is the condition number of $F$ and $\rho \in (0,1)$ characterizes the connectivity of the network. This is the first linear-rate result for distributed composite optimization; it also improves on existing (non-accelerated) schemes just minimizing $F$, whose rate depends on much larger quantities than $\kappa_g$ (e.g., the worst-case condition number among the agents). When considering in particular empirical risk minimization problems with statistically similar data across the agents, SONATA employing high-order surrogates achieves precision $\epsilon > 0$ in $\mathcal{O}((\beta/\mu) \log(1/\epsilon))$ iterations and $\tilde{\mathcal{O}}((\beta/\mu)(1-\rho)^{-1/2} \log(1/\epsilon))$ communication steps, where $\beta$ measures the degree of similarity of the agents' losses and $\mu$ is the strong convexity constant of $F$. Therefore, when $\beta/\mu < \kappa_g$, the use of high-order surrogates yields provably faster rates than those achievable by first-order models; this is obtained without exchanging any Hessian matrix over the network.
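To make the gradient-tracking mechanism concrete, the following is a minimal sketch (not the paper's algorithm) of a gradient-tracking iteration in the spirit of SONATA, under simplifying assumptions: a static undirected graph with a doubly stochastic mixing matrix W (so no push-sum correction is needed), a linearized surrogate so that each local subproblem reduces to a gradient step, no nonsmooth term $G$, and illustrative problem data, graph, and step size chosen for the example only. The paper's general scheme covers directed, time-varying graphs and richer (e.g., high-order) surrogates.

```python
import numpy as np

# Sketch of a gradient-tracking iteration under the simplifying assumptions above.
# Problem: agent i holds a local least-squares loss f_i(x) = 0.5*||A_i x - b_i||^2,
# and the network minimizes F(x) = sum_i f_i(x).

rng = np.random.default_rng(0)
n_agents, dim = 5, 3
A = [rng.standard_normal((10, dim)) for _ in range(n_agents)]   # illustrative data (assumed)
b = [rng.standard_normal(10) for _ in range(n_agents)]

def grad(i, x):
    """Gradient of the local loss of agent i at x."""
    return A[i].T @ (A[i] @ x - b[i])

# Doubly stochastic mixing matrix for a ring graph (illustrative choice).
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

step = 0.01                                   # illustrative step size (assumed)
x = np.zeros((n_agents, dim))                 # local copies of the decision variable
y = np.array([grad(i, x[i]) for i in range(n_agents)])   # gradient trackers
g_old = y.copy()

for _ in range(2000):
    # Local step: with a linearized surrogate the subproblem solution is a gradient
    # step along the tracked direction y_i (an estimate of the average local gradient).
    x_half = x - step * y
    # Consensus on the local variables.
    x = W @ x_half
    # Gradient tracking: mix the trackers and correct with the change in local gradients.
    g_new = np.array([grad(i, x[i]) for i in range(n_agents)])
    y = W @ y + g_new - g_old
    g_old = g_new

# All local copies should be close to the centralized minimizer of sum_i f_i.
x_star = np.linalg.lstsq(np.vstack(A), np.concatenate(b), rcond=None)[0]
print(np.max(np.linalg.norm(x - x_star, axis=1)))
```

The point of the tracker update is that, with a doubly stochastic W, the average of the y_i always equals the current average of the local gradients, so each agent's step uses an increasingly accurate proxy for the gradient of $F$ rather than only its own local gradient; swapping the gradient step for the minimization of a higher-order surrogate is what the abstract refers to when the agents' data are statistically similar.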
View on arXiv