Annealed Sinkhorn for Optimal Transport: convergence, regularization path and debiasing

21 August 2024

Lénaïc Chizat

Abstract

Sinkhorn's algorithm is a method of choice to solve large-scale optimal transport (OT) problems. In this context, it involves an inverse temperature parameter $\beta$ that determines the speed-accuracy trade-off. To improve this trade-off, practitioners often use a variant of this algorithm, Annealed Sinkhorn, that uses an nondecreasing sequence $(\beta_t)_{t\in \mathbb{N}}$ where $t$ is the iteration count. However, besides for the schedule $\beta_t=\Theta(\log t)$ which is impractically slow, it is not known whether this variant is guaranteed to actually solve OT. Our first contribution answers this question: we show that a concave annealing schedule asymptotically solves OT if and only if $\beta_t\to+\infty$ and $\beta_t-\beta_{t-1}\to 0$ . The proof is based on an equivalence with Online Mirror Descent and further suggests that the iterates of Annealed Sinkhorn follow the solutions of a sequence of relaxed, entropic OT problems, the regularization path. An analysis of this path reveals that, in addition to the well-known "entropic" error in $\Theta(\beta^{-1}_t)$ , the annealing procedure induces a "relaxation" error in $\Theta(\beta_{t}-\beta_{t-1})$ . The best error trade-off is achieved with the schedule $\beta_t = \Theta(\sqrt{t})$ which, albeit slow, is a universal limitation of this method. Going beyond this limitation, we propose a simple modification of Annealed Sinkhorn that reduces the relaxation error, and therefore enables faster annealing schedules. In toy experiments, we observe the effectiveness of our Debiased Annealed Sinkhorn's algorithm: a single run of this algorithm spans the whole speed-accuracy Pareto front of the standard Sinkhorn's algorithm.

View on arXiv

Comments on this paper