Online Non-Convex Learning: Following the Perturbed Leader is Optimal

Abstract
We study the problem of online learning with non-convex losses, where the learner has access to an offline optimization oracle. We show that the classical Follow the Perturbed Leader (FTPL) algorithm achieves the optimal regret rate of $O(T^{-1/2})$ in this setting. This improves upon the previous best-known regret rate of $O(T^{-1/3})$ for FTPL. We further show that an optimistic variant of FTPL achieves better regret bounds when the sequence of losses encountered by the learner is "predictable".
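For concreteness, the following is a minimal Python sketch of the FTPL loop the abstract refers to: at each round the learner draws a fresh coordinate-wise exponential perturbation and asks the offline optimization oracle to minimize the perturbed cumulative loss. The oracle interface, the grid-search stand-in, and the perturbation scale `eta` are illustrative assumptions, not the paper's exact construction or tuning.

```python
import numpy as np

def ftpl(oracle, losses, d, eta, rng=None):
    """Play len(losses) rounds of Follow the Perturbed Leader.

    oracle(past_losses, sigma): returns a point (approximately) minimizing
        sum_{s < t} f_s(x) - <sigma, x> over the decision set
        (the offline optimization oracle assumed by the learner).
    losses: list of per-round loss functions f_t (possibly non-convex).
    eta: perturbation scale (illustrative; see the paper for the tuned value).
    """
    rng = np.random.default_rng() if rng is None else rng
    played = []
    for t in range(len(losses)):
        sigma = rng.exponential(scale=eta, size=d)  # fresh perturbation each round
        x_t = oracle(losses[:t], sigma)             # single offline oracle call
        played.append(x_t)                          # learner then suffers f_t(x_t)
    return played

# Toy usage: 1-D decision set [0, 1] discretized to a grid, with grid search
# standing in for the offline oracle (purely for illustration).
if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 201)

    def grid_oracle(past_losses, sigma):
        scores = sum(f(grid) for f in past_losses) - sigma[0] * grid
        return np.array([grid[np.argmin(scores)]])

    T = 50
    # Non-convex per-round losses, e.g. f_t(x) = sin(5x + t).
    losses = [(lambda s: (lambda x: np.sin(5 * x + s)))(t) for t in range(T)]
    points = ftpl(grid_oracle, losses, d=1, eta=np.sqrt(T))
    print(points[:5])
```

Regret here is measured against the best fixed point in hindsight; the paper's result is that this single-oracle-call-per-round scheme already attains the optimal rate.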