We consider the problem of online forecasting of sequences of length $n$ with total variation at most $C_n$ using observations contaminated by independent $\sigma$-subgaussian noise. We design an $O(n\log n)$-time algorithm that achieves a cumulative square error of $\tilde{O}(n^{1/3}C_n^{2/3}\sigma^{4/3} + C_n^2)$ with high probability. We also prove a lower bound that matches the upper bound in all parameters (up to a $\sqrt{\log(n)}$ factor). To the best of our knowledge, this is the first \emph{polynomial-time} algorithm that achieves the optimal $\tilde{O}(n^{1/3})$ rate in forecasting total-variation-bounded sequences and the first algorithm that \emph{adapts to unknown} $C_n$. Our proof techniques leverage the special localized structure of the Haar wavelet basis and the adaptivity to unknown smoothness parameters in classical wavelet smoothing [Donoho et al., 1998]. We also compare our model to the rich literature on dynamic regret minimization and nonstationary stochastic optimization, where our problem can be treated as a special case. We show that the workhorses in those settings, online gradient descent and its variants with a fixed restarting schedule, are instances of a class of \emph{linear forecasters} that require a suboptimal regret of $\tilde{\Omega}(\sqrt{nC_n})$. This implies that more adaptive algorithms are necessary to obtain the optimal rate.
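To make the wavelet-smoothing ingredient concrete, here is a minimal sketch of the classical Haar soft-thresholding estimator in the spirit of Donoho et al., which the abstract cites; it is illustrative only and is not the paper's online algorithm. The power-of-two signal length, the known noise level `sigma`, and the universal threshold sigma * sqrt(2 log n) are assumptions of this sketch.

```python
import numpy as np

def haar_forward(y):
    """Orthonormal Haar transform; y must have power-of-two length."""
    a, details = y.astype(float), []
    while len(a) > 1:
        pairs = a.reshape(-1, 2)
        details.append((pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0))  # detail coefficients
        a = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)              # approximation
    return a, details

def haar_inverse(approx, details):
    """Invert haar_forward."""
    a = approx
    for d in reversed(details):
        out = np.empty(2 * len(a))
        out[0::2] = (a + d) / np.sqrt(2.0)
        out[1::2] = (a - d) / np.sqrt(2.0)
        a = out
    return a

def wavelet_smooth(y, sigma):
    """Soft-threshold the Haar detail coefficients at the universal level."""
    approx, details = haar_forward(y)
    lam = sigma * np.sqrt(2.0 * np.log(len(y)))  # universal threshold (assumed choice)
    shrunk = [np.sign(d) * np.maximum(np.abs(d) - lam, 0.0) for d in details]
    return haar_inverse(approx, shrunk)

# Demo: a piecewise-constant trend (total variation 5) under Gaussian noise.
rng = np.random.default_rng(0)
theta = np.repeat([0.0, 2.0, 1.0, -1.0], 64)
y_obs = theta + rng.normal(scale=0.5, size=theta.size)
theta_hat = wavelet_smooth(y_obs, sigma=0.5)
```

The localized support of the Haar basis is what makes such shrinkage adapt to signals whose variation is concentrated in a few bursts, which is the property the paper's online analysis exploits.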
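For contrast, the fixed-restart baseline that the abstract classifies as a linear forecaster can be sketched as a restarting running mean (online gradient descent on the squared loss with a 1/(2t) step size reduces to online averaging); the block length `k` and the zero prediction at each restart are assumptions of this illustration, not choices made in the paper.

```python
def restarting_mean_forecaster(y, k):
    """One-step-ahead forecasts: mean of the observations since the last restart."""
    preds, block_sum, block_len = [], 0.0, 0
    for t, obs in enumerate(y):
        if t % k == 0:            # fixed restarting schedule: drop the old block
            block_sum, block_len = 0.0, 0
        # Forecast before seeing obs; predict 0.0 when the block is empty.
        preds.append(block_sum / block_len if block_len else 0.0)
        block_sum += obs
        block_len += 1
    return preds
```

Each forecast here is a fixed linear function of past observations, which is exactly the structure the lower bound for linear forecasters applies to.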