Better Best of Both Worlds Bounds for Bandits with Switching Costs

7 June 2022
Idan Amir, Guy Azov, Tomer Koren, Roi Livni
Abstract

We study best-of-both-worlds algorithms for bandits with switching costs, recently addressed by Rouyer, Seldin and Cesa-Bianchi, 2021. We introduce a surprisingly simple and effective algorithm that simultaneously achieves the minimax optimal regret bound of $\mathcal{O}(T^{2/3})$ in the oblivious adversarial setting and a bound of $\mathcal{O}(\min\{\log(T)/\Delta^2, T^{2/3}\})$ in the stochastically-constrained regime, both with (unit) switching costs, where $\Delta$ is the gap between the arms. In the stochastically-constrained case, our bound improves over previous results due to Rouyer et al., who achieved regret of $\mathcal{O}(T^{1/3}/\Delta)$. We accompany our results with a lower bound showing that, in general, $\tilde{\Omega}(\min\{1/\Delta^2, T^{2/3}\})$ regret is unavoidable in the stochastically-constrained case for algorithms with $\mathcal{O}(T^{2/3})$ worst-case regret.
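To illustrate where the $\mathcal{O}(T^{2/3})$ rate with unit switching costs comes from, below is a minimal, hypothetical sketch (not the authors' algorithm) of the standard blocking argument: running EXP3 over batches of length roughly $T^{1/3}$ caps the number of switches at roughly $T^{2/3}$ while keeping the adversarial regret at the same order. The function names (`batched_exp3`, `loss_fn`) and all parameter choices are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch: mini-batched EXP3 for bandits with switching costs.
# Batch length ~ T^{1/3}  =>  number of batches (and hence switches) ~ T^{2/3},
# and batched EXP3 regret is also O(sqrt(K log K) * T^{2/3}).
import numpy as np


def batched_exp3(T, K, loss_fn, seed=None):
    """Run EXP3 over batches of length ceil(T^(1/3)).

    loss_fn(t, arm) must return a loss in [0, 1] for round t and the chosen arm.
    Returns the sequence of pulled arms and the number of switches incurred.
    """
    rng = np.random.default_rng(seed)
    B = int(np.ceil(T ** (1 / 3)))                 # batch length ~ T^{1/3}
    n_batches = int(np.ceil(T / B))                # ~ T^{2/3} batches
    eta = np.sqrt(np.log(K) / (n_batches * K))     # EXP3 learning rate over batches
    cum_est = np.zeros(K)                          # importance-weighted loss estimates
    arms, switches, prev_arm = [], 0, None

    for b in range(n_batches):
        # Sample one arm per batch and commit to it for the whole batch.
        weights = np.exp(-eta * (cum_est - cum_est.min()))
        probs = weights / weights.sum()
        arm = rng.choice(K, p=probs)
        if prev_arm is not None and arm != prev_arm:
            switches += 1
        prev_arm = arm

        # Play the arm for the batch; feed the batch-averaged loss back to EXP3.
        start, end = b * B, min((b + 1) * B, T)
        batch_loss = np.mean([loss_fn(t, arm) for t in range(start, end)])
        cum_est[arm] += batch_loss / probs[arm]    # unbiased importance-weighted update
        arms.extend([arm] * (end - start))

    return arms, switches
```

For example, with a stochastic `loss_fn` given by Bernoulli arms separated by a gap $\Delta$, this blocked scheme still pays the worst-case $T^{2/3}$ price; the paper's contribution is a best-of-both-worlds algorithm that instead adapts to such instances and achieves the improved $\mathcal{O}(\min\{\log(T)/\Delta^2, T^{2/3}\})$ bound.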
