Adaptive Online Learning with Varying Norms

Abstract

Given any increasing sequence of norms $\|\cdot\|_0,\dots,\|\cdot\|_{T-1}$, we provide an online convex optimization algorithm that outputs points $w_t$ in some domain $W$ in response to convex losses $\ell_t:W\to \mathbb{R}$ and guarantees regret $R_T(u)=\sum_{t=1}^T \ell_t(w_t)-\ell_t(u)\le \tilde O\left(\|u\|_{T-1}\sqrt{\sum_{t=1}^T \|g_t\|_{t-1,\star}^2}\right)$, where $g_t$ is a subgradient of $\ell_t$ at $w_t$. Our method does not require tuning to the value of $u$ and allows for arbitrary convex $W$. We apply this result to obtain new "full-matrix"-style regret bounds. Along the way, we provide a new examination of the full-matrix AdaGrad algorithm, suggesting a better learning rate value that improves significantly upon prior analysis. We use our new techniques to tune AdaGrad on-the-fly, realizing our improved bound in a concrete algorithm.
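Since the abstract refers to the full-matrix AdaGrad algorithm, the following is a minimal sketch of the standard full-matrix AdaGrad update (in the spirit of Duchi, Hazan, and Singer, 2011), shown only for context. The learning rate `eta`, the regularizer `eps`, and the function names are illustrative assumptions; this does not implement the paper's improved learning-rate choice or on-the-fly tuning.

```python
# Sketch of the standard full-matrix AdaGrad update on an unconstrained domain R^d.
# eta, eps, and the interface are assumptions for illustration; the paper's
# improved tuning of the learning rate is not reproduced here.
import numpy as np

def full_matrix_adagrad(grad_fn, w0, num_steps, eta=1.0, eps=1e-8):
    """grad_fn(t, w) returns a (sub)gradient g_t of the loss ell_t at w."""
    w = np.asarray(w0, dtype=float)
    d = w.shape[0]
    G = np.zeros((d, d))            # running sum of gradient outer products
    iterates = []
    for t in range(num_steps):
        g = grad_fn(t, w)
        iterates.append(w.copy())
        G += np.outer(g, g)         # G_t = sum_{s <= t} g_s g_s^T
        # Preconditioner G_t^{-1/2}, regularized by eps to keep it invertible.
        vals, vecs = np.linalg.eigh(G + eps * np.eye(d))
        G_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
        w = w - eta * G_inv_sqrt @ g
    return iterates
```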
