Minimax Regret for Bandit Convex Optimisation of Ridge Functions

Abstract

We analyse adversarial bandit convex optimisation with an adversary that is restricted to playing functions of the form $f(x) = g(\langle x, \theta\rangle)$ for convex $g : \mathbb R \to \mathbb R$ and $\theta \in \mathbb R^d$. We provide a short information-theoretic proof that the minimax regret is at most $O(d\sqrt{n} \log(\operatorname{diam}\mathcal K))$ where $n$ is the number of interactions, $d$ the dimension and $\operatorname{diam}(\mathcal K)$ is the diameter of the constraint set. Hence, this class of functions is at most logarithmically harder than the linear case.
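
For orientation, the setting can be written out as follows. This is a sketch under the usual conventions of adversarial bandit convex optimisation, not the paper's own statement: the per-round symbols $x_t$, $f_t$, $g_t$, $\theta_t$ and the regret notation $\mathfrak R_n$ are assumptions introduced here, and the abstract does not fix the exact form of the regret definition.

```latex
% Assumed protocol: in each round t = 1, ..., n the learner plays x_t in K,
% the adversary chooses a ridge function f_t, and the learner observes only f_t(x_t).
\begin{align*}
  f_t(x) &= g_t(\langle x, \theta_t \rangle),
    \qquad g_t : \mathbb R \to \mathbb R \text{ convex}, \quad \theta_t \in \mathbb R^d, \\
  % Regret: loss of the learner relative to the best fixed action in hindsight.
  \mathfrak R_n &= \sup_{x \in \mathcal K} \sum_{t=1}^{n} \bigl( f_t(x_t) - f_t(x) \bigr), \\
  % Minimax regret: best learner against the worst admissible adversary,
  % together with the upper bound claimed in the abstract.
  \inf_{\text{learner}} \, \sup_{\text{adversary}} \, \mathbb E[\mathfrak R_n]
    &= O\bigl( d \sqrt{n} \, \log(\operatorname{diam} \mathcal K) \bigr).
\end{align*}
```

Taking $g_t$ to be the identity gives $f_t(x) = \langle x, \theta_t \rangle$, so the linear case is contained in this class.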
