Best of Both Worlds: Regret Minimization versus Minimax Play

17 February 2025
Adrian Müller
Jon Schneider
Stratis Skoulakis
Luca Viano
Volkan Cevher
    OffRL
Main: 8 pages
Figures: 3
Bibliography: 5 pages
Appendix: 14 pages
Abstract

In this paper, we investigate the existence of online learning algorithms with bandit feedback that simultaneously guarantee $O(1)$ regret compared to a given comparator strategy and $O(\sqrt{T})$ regret compared to the best strategy in hindsight, where $T$ is the number of rounds. We provide the first affirmative answer to this question. In the context of symmetric zero-sum games, in both normal and extensive form, we show that our results allow us to risk at most $O(1)$ loss while still being able to gain $\Omega(T)$ from exploitable opponents, thereby combining the benefits of both no-regret algorithms and minimax play.
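The two regret notions in the abstract can be made concrete with a small sketch. The code below is not the paper's algorithm: it only measures regret against a fixed comparator and regret against the best action in hindsight. It assumes rock-paper-scissors as the symmetric zero-sum game, expected losses computed from full loss vectors (rather than bandit feedback), and hypothetical helper names (`loss_vector`, `expected_loss`, `regrets`). It illustrates why pure minimax play is safe but cannot exploit a weak opponent.

```python
from fractions import Fraction

# Illustrative assumption: rock-paper-scissors as the symmetric zero-sum game.
# GAIN[i][j] is the row player's gain when playing action i against action j.
GAIN = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def loss_vector(opponent_action):
    """Loss of each of our actions against the opponent's action (loss = -gain)."""
    return [-GAIN[i][opponent_action] for i in range(3)]


def expected_loss(strategy, losses):
    """Expected loss of a mixed strategy against a loss vector."""
    return sum(p * l for p, l in zip(strategy, losses))


def regrets(learner_strategies, opponent_actions, comparator):
    """Return (regret vs. comparator, regret vs. best fixed action in hindsight)."""
    learner_total = 0
    comparator_total = 0
    action_totals = [0, 0, 0]
    for x_t, j_t in zip(learner_strategies, opponent_actions):
        losses = loss_vector(j_t)
        learner_total += expected_loss(x_t, losses)
        comparator_total += expected_loss(comparator, losses)
        action_totals = [a + l for a, l in zip(action_totals, losses)]
    return learner_total - comparator_total, learner_total - min(action_totals)


# Hypothetical scenario: an exploitable opponent that always plays rock,
# and a learner that always plays the minimax (uniform) strategy.
T = 100
uniform = [Fraction(1, 3)] * 3
vs_comparator, vs_best = regrets([uniform] * T, [0] * T, comparator=uniform)
print(vs_comparator, vs_best)
```

In this scenario the regret against the minimax comparator is zero, while the regret against the best action in hindsight (always paper) grows linearly in `T`. An algorithm with the guarantees described in the abstract would keep the first quantity at $O(1)$ while driving the second down to $O(\sqrt{T})$.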

@article{müller2025_2502.11673,
  title={Best of Both Worlds: Regret Minimization versus Minimax Play},
  author={Adrian Müller and Jon Schneider and Stratis Skoulakis and Luca Viano and Volkan Cevher},
  journal={arXiv preprint arXiv:2502.11673},
  year={2025}
}