Annealed Sinkhorn for Optimal Transport: convergence, regularization path and debiasing

Lénaïc Chizat
Abstract

Sinkhorn's algorithm is a method of choice to solve large-scale optimal transport (OT) problems. In this context, it involves an inverse temperature parameter $\beta$ that determines the speed-accuracy trade-off. To improve this trade-off, practitioners often use a variant of this algorithm, Annealed Sinkhorn, that uses a nondecreasing sequence $(\beta_t)_{t\in \mathbb{N}}$, where $t$ is the iteration count. However, except for the schedule $\beta_t=\Theta(\log t)$, which is impractically slow, it is not known whether this variant is guaranteed to actually solve OT. Our first contribution answers this question: we show that a concave annealing schedule asymptotically solves OT if and only if $\beta_t\to+\infty$ and $\beta_t-\beta_{t-1}\to 0$. The proof is based on an equivalence with Online Mirror Descent and further suggests that the iterates of Annealed Sinkhorn follow the solutions of a sequence of relaxed, entropic OT problems, the regularization path. An analysis of this path reveals that, in addition to the well-known "entropic" error in $\Theta(\beta_t^{-1})$, the annealing procedure induces a "relaxation" error in $\Theta(\beta_t-\beta_{t-1})$. The best error trade-off is achieved with the schedule $\beta_t = \Theta(\sqrt{t})$ which, albeit slow, is a universal limitation of this method. Going beyond this limitation, we propose a simple modification of Annealed Sinkhorn that reduces the relaxation error, and therefore enables faster annealing schedules. In toy experiments, we observe the effectiveness of our Debiased Annealed Sinkhorn algorithm: a single run of this algorithm spans the whole speed-accuracy Pareto front of the standard Sinkhorn algorithm.
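
To make the annealing idea concrete, here is a minimal pure-Python sketch of Sinkhorn iterations in which the inverse temperature changes at every step. It is an illustration under stated assumptions, not the paper's implementation: the function name `annealed_sinkhorn`, the helper `_lse`, the log-domain formulation, and the choice to carry the dual potentials (in cost units) from one temperature to the next are all choices made for this example. It does not include the debiasing modification, whose details the abstract does not give. The schedule used in the example, $\beta_t=\sqrt{t}$, is concave and satisfies $\beta_t\to+\infty$ and $\beta_t-\beta_{t-1}\to 0$.

```python
import math


def _lse(xs):
    """Numerically stable log-sum-exp of a list of floats (hypothetical helper)."""
    mx = max(xs)
    return mx + math.log(sum(math.exp(x - mx) for x in xs))


def annealed_sinkhorn(cost, p, q, schedule, n_iters):
    """Sketch of Sinkhorn iterations with a time-varying inverse temperature.

    cost:     m x n cost matrix (list of lists)
    p, q:     strictly positive marginals of sizes m and n, each summing to 1
    schedule: function t -> beta_t (assumed nondecreasing and positive)
    Returns the transport plan built from the last potentials.
    """
    m, n = len(p), len(q)
    g = [0.0] * n  # column dual potential, carried across temperatures (assumption)
    beta = schedule(1)
    f = [0.0] * m
    for t in range(1, n_iters + 1):
        beta = schedule(t)
        # Row update: enforce the first marginal at the current temperature.
        f = [
            -_lse([math.log(q[j]) + beta * (g[j] - cost[i][j]) for j in range(n)]) / beta
            for i in range(m)
        ]
        # Column update: enforce the second marginal at the current temperature.
        g = [
            -_lse([math.log(p[i]) + beta * (f[i] - cost[i][j]) for i in range(m)]) / beta
            for j in range(n)
        ]
    # Primal plan associated with the final potentials and final temperature.
    return [
        [p[i] * q[j] * math.exp(beta * (f[i] + g[j] - cost[i][j])) for j in range(n)]
        for i in range(m)
    ]


# Toy problem: two points on each side, zero cost on the diagonal.
cost = [[0.0, 1.0], [1.0, 0.0]]
p = [0.5, 0.5]
q = [0.5, 0.5]
plan = annealed_sinkhorn(cost, p, q, schedule=lambda t: math.sqrt(t), n_iters=200)
transport_cost = sum(plan[i][j] * cost[i][j] for i in range(2) for j in range(2))
```

On this toy problem the exact OT plan puts mass 0.5 on each diagonal entry, so `transport_cost` should be close to zero after 200 iterations. Because the column update is performed last, the column marginals of `plan` match `q` up to floating-point error, while the row marginals are in general only approximately satisfied.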
