Adversarial Online Learning with Temporal Feedback Graphs

Abstract

We study a variant of prediction with expert advice where the learner's action at round $t$ is only allowed to depend on losses on a specific subset of the rounds (where the structure of which rounds' losses are visible at time $t$ is provided by a directed "feedback graph" known to the learner). We present a novel learning algorithm for this setting based on a strategy of partitioning the losses across sub-cliques of this graph. We complement this with a lower bound that is tight in many practical settings, and which we conjecture to be within a constant factor of optimal. For the important class of transitive feedback graphs, we prove that this algorithm is efficiently implementable and obtains the optimal regret bound (up to a universal constant).
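To make the information constraint concrete, here is a minimal sketch of the setting only. It runs a naive exponential-weights (Hedge) baseline that, at each round, uses only the losses from rounds the feedback graph makes visible. This is not the sub-clique partitioning algorithm of the paper; the function name `restricted_hedge`, the `visible` encoding of the graph (a list of sets of earlier round indices), and the learning rate `eta` are all assumptions introduced for illustration.

```python
import math

def restricted_hedge(losses, visible, eta):
    """Naive Hedge baseline under a temporal feedback graph (illustrative only).

    losses[t][i]: loss of expert i at round t, assumed to lie in [0, 1].
    visible[t]:   set of earlier rounds s < t whose losses may be used at round t
                  (a hypothetical encoding of the feedback graph).
    Returns the learner's expected regret against the best fixed expert.
    """
    n_experts = len(losses[0])
    learner_loss = 0.0
    for t, round_losses in enumerate(losses):
        # The learner may only look at earlier rounds.
        assert all(s < t for s in visible[t])
        # Cumulative loss of each expert, restricted to the visible rounds.
        cum = [sum(losses[s][i] for s in visible[t]) for i in range(n_experts)]
        # Exponential weights, shifted by the minimum for numerical stability.
        shift = min(cum)
        weights = [math.exp(-eta * (c - shift)) for c in cum]
        norm = sum(weights)
        probs = [w / norm for w in weights]
        # Expected loss of the randomized action at round t.
        learner_loss += sum(p * l for p, l in zip(probs, round_losses))
    best_expert_loss = min(sum(row[i] for row in losses) for i in range(n_experts))
    return learner_loss - best_expert_loss

# Two experts, three rounds; expert 0 is always better.
losses = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
no_feedback = [set(), set(), set()]          # nothing is ever visible
full_feedback = [set(), {0}, {0, 1}]         # every earlier round is visible
regret_blind = restricted_hedge(losses, no_feedback, eta=1.0)
regret_full = restricted_hedge(losses, full_feedback, eta=1.0)
```

With no visible rounds the learner plays uniformly throughout, while full visibility recovers the standard experts setting; the graphs of interest in the abstract sit between these two extremes.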
