
Interpolating Between Gradient Descent and Exponentiated Gradient Using Reparameterized Gradient Descent

Abstract

Continuous-time mirror descent (CMD) can be seen as the limit of the discrete-time MD update as the step size tends to zero. In this paper, we focus on the geometry of the primal and dual CMD updates and introduce a general framework for reparameterizing one CMD update as another. Specifically, the reparameterized update also corresponds to a CMD, but on the composite loss w.r.t. the new variables, and the original variables are obtained via the reparameterization map. We employ these results to introduce a new family of reparameterizations that interpolate between the two commonly used updates, namely continuous-time gradient descent (GD) and unnormalized exponentiated gradient (EGU), while extending to many other well-known updates. In particular, we show that for the underdetermined linear regression problem, these updates generalize the known behavior of GD and EGU, and provably converge to the minimum $\mathrm{L}_{2-\tau}$-norm solution for $\tau \in [0,1]$. Our new results also have implications for the regularized training of neural networks to induce sparsity.
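The EGU endpoint of this interpolation can be illustrated with a classical reparameterization: running GD on variables $u$ with $w = u^2/4$ yields, in continuous time, the unnormalized EGU dynamics $\dot{w} = -w \odot \nabla L(w)$. Below is a minimal NumPy sketch on an underdetermined regression problem with a sparse nonnegative solution; all dimensions, step sizes, and variable names are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Underdetermined linear regression: fewer equations than unknowns,
# with a sparse, nonnegative ground truth (illustrative setup).
rng = np.random.default_rng(0)
n, d = 5, 20
A = rng.standard_normal((n, d))
w_star = np.zeros(d)
w_star[:2] = [3.0, 2.0]            # sparse ground truth
b = A @ w_star

def grad(w):
    # Gradient of the squared loss L(w) = 0.5 * ||A w - b||^2.
    return A.T @ (A @ w - b)

lr, T = 5e-3, 100_000

# Plain GD on w, started at zero: converges to the minimum-L2-norm
# interpolating solution (the tau = 0 endpoint).
w = np.zeros(d)
for _ in range(T):
    w -= lr * grad(w)

# GD on u with the reparameterization w = u**2 / 4 (the EGU-like,
# tau = 1 endpoint).  By the chain rule, dL/du = (u/2) * grad(w).
u = np.ones(d)
for _ in range(T):
    u -= lr * (u / 2) * grad(u * u / 4)
w_egu = u * u / 4

# Both runs fit the data; the reparameterized run stays in the
# nonnegative orthant and tends to concentrate on the sparse support,
# while plain GD spreads mass to minimize the L2 norm.
print(np.linalg.norm(A @ w - b), np.linalg.norm(A @ w_egu - b))
```

The sketch only shows the two endpoints; the paper's family of reparameterizations interpolates between these behaviors as $\tau$ varies over $[0,1]$.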
