Online Optimization with Feedback Delay and Nonlinear Switching Cost

Abstract

We study a variant of online optimization in which the learner receives $k$-round delayed feedback about hitting cost and there is a multi-step nonlinear switching cost, i.e., costs depend on multiple previous actions in a nonlinear manner. Our main result shows that a novel Iterative Regularized Online Balanced Descent (iROBD) algorithm has a constant, dimension-free competitive ratio that is $O(L^{2k})$, where $L$ is the Lipschitz constant of the switching cost. Additionally, we provide lower bounds that illustrate that the Lipschitz condition is required and that the dependencies on $k$ and $L$ are tight. Finally, via reductions, we show that this setting is closely related to online control problems with delay, nonlinear dynamics, and adversarial disturbances, where iROBD directly offers constant-competitive online policies.
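
To make the setting concrete, the objective and the notion of competitiveness can be sketched as below. The notation is assumed for illustration and does not come from the abstract: $y_t$ is the action in round $t$, $f_t$ is the hitting cost (revealed $k$ rounds late), $c$ is the switching cost, $p$ is a hypothetical memory length for the number of previous actions it depends on, and $C$ is the competitive ratio.

```latex
% Total cost over T rounds: hitting cost plus a multi-step switching cost.
% (p, the number of past actions entering c, is an assumed symbol.)
\[
\mathrm{cost}(ALG) \;=\; \sum_{t=1}^{T} \Big( f_t(y_t) + c\big(y_t,\, y_{t-1},\, \dots,\, y_{t-p}\big) \Big)
\]

% Standard definition: ALG is C-competitive if, on every input sequence,
% its cost is at most C times that of the offline optimum OPT.
\[
\mathrm{cost}(ALG) \;\le\; C \cdot \mathrm{cost}(OPT)
\]
```

Under this reading, the main result says that iROBD achieves $C = O(L^{2k})$, a bound that does not grow with the dimension of the action space.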

