KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints

14 May 2018

Aurélien Garivier

Pierre Menard

Papers citing "KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints"

Title | Authors | Counts | Date
On Lai's Upper Confidence Bound in Multi-Armed Bandits | Huachen Ren, Cun-Hui Zhang | 26 1 0 | 03 Oct 2024
Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization | D. Tiapkin, Evgenii Chzhen, Gilles Stoltz | 74 1 0 | 08 Jul 2024
Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards | Hao Qin, Kwang-Sung Jun, Chicheng Zhang | 32 0 0 | 28 Apr 2023
Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms | Denis Belomestny, Pierre Menard, A. Naumov, D. Tiapkin, Michal Valko | 22 2 0 | 06 Apr 2023
Top Two Algorithms Revisited | Marc Jourdan, Rémy Degenne, Dorian Baudry, R. D. Heide, E. Kaufmann | 26 38 0 | 13 Jun 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses | D. Tiapkin, Denis Belomestny, Eric Moulines, A. Naumov, S. Samsonov, Yunhao Tang, Michal Valko, Pierre Menard | 31 17 0 | 16 May 2022
Finite-time Analysis of Globally Nonstationary Multi-Armed Bandits | Junpei Komiyama, Edouard Fouché, Junya Honda | 33 5 0 | 23 Jul 2021
On Regret with Multiple Best Arms | Yinglun Zhu, Robert D. Nowak | 19 17 0 | 26 Jun 2020