arXiv:2305.15383

On the Minimax Regret for Online Learning with Feedback Graphs

24 May 2023
Khaled Eldowa
Emmanuel Esposito
Tommaso Cesari
Nicolò Cesa-Bianchi
Abstract

In this work, we improve on the upper and lower bounds for the regret of online learning with strongly observable undirected feedback graphs. The best known upper bound for this problem is $\mathcal{O}\bigl(\sqrt{\alpha T\ln K}\bigr)$, where $K$ is the number of actions, $\alpha$ is the independence number of the graph, and $T$ is the time horizon. The $\sqrt{\ln K}$ factor is known to be necessary when $\alpha = 1$ (the experts case). On the other hand, when $\alpha = K$ (the bandits case), the minimax rate is known to be $\Theta\bigl(\sqrt{KT}\bigr)$, and a lower bound $\Omega\bigl(\sqrt{\alpha T}\bigr)$ is known to hold for any $\alpha$. Our improved upper bound $\mathcal{O}\bigl(\sqrt{\alpha T(1+\ln(K/\alpha))}\bigr)$ holds for any $\alpha$ and matches the lower bounds for bandits and experts, while interpolating intermediate cases. To prove this result, we use FTRL with $q$-Tsallis entropy for a carefully chosen value of $q \in [1/2, 1)$ that varies with $\alpha$. The analysis of this algorithm requires a new bound on the variance term in the regret. We also show how to extend our techniques to time-varying graphs, without requiring prior knowledge of their independence numbers. Our upper bound is complemented by an improved $\Omega\bigl(\sqrt{\alpha T(\ln K)/(\ln\alpha)}\bigr)$ lower bound for all $\alpha > 1$, whose analysis relies on a novel reduction to multitask learning. This shows that a logarithmic factor is necessary as soon as $\alpha < K$.
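To make the interpolation between the experts and bandits cases concrete, the rates stated in the abstract can be evaluated numerically. The following is a minimal sketch, not the authors' code: it drops all constant factors hidden by the asymptotic notation (so the numbers only indicate growth rates, not actual regret values), and the function names and the sample values of `K` and `T` are assumptions chosen for illustration.

```python
import math


def previous_upper_rate(alpha: int, K: int, T: int) -> float:
    """Rate sqrt(alpha * T * ln K) of the previously best known upper bound.

    Constants are dropped (assumption): this is only the growth rate.
    """
    return math.sqrt(alpha * T * math.log(K))


def improved_upper_rate(alpha: int, K: int, T: int) -> float:
    """Rate sqrt(alpha * T * (1 + ln(K / alpha))) of the improved upper bound."""
    return math.sqrt(alpha * T * (1 + math.log(K / alpha)))


def improved_lower_rate(alpha: int, K: int, T: int) -> float:
    """Rate sqrt(alpha * T * ln K / ln alpha) of the lower bound, stated for alpha > 1."""
    if alpha <= 1:
        raise ValueError("the lower bound in the abstract is stated for alpha > 1")
    return math.sqrt(alpha * T * math.log(K) / math.log(alpha))


# Hypothetical problem sizes, chosen only for illustration.
K, T = 1024, 10_000

for alpha in (1, 32, K):
    old = previous_upper_rate(alpha, K, T)
    new = improved_upper_rate(alpha, K, T)
    line = f"alpha={alpha:5d}  previous={old:10.1f}  improved={new:10.1f}"
    if alpha > 1:
        line += f"  lower={improved_lower_rate(alpha, K, T):10.1f}"
    print(line)
```

With $\alpha = K$ the logarithm in the improved upper rate vanishes and both the improved upper rate and the lower rate reduce to $\sqrt{KT}$, the bandits rate. With $\alpha = 1$ the improved upper rate is $\sqrt{T(1+\ln K)}$, which agrees with the experts rate $\sqrt{T\ln K}$ up to a constant factor for $K \ge 2$.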
