Corralling Stochastic Bandit Algorithms

16 June 2020

Papers citing "Corralling Stochastic Bandit Algorithms"


Title | Authors | Tag | Metrics | Date
A Model Selection Approach for Corruption Robust Reinforcement Learning | Chen-Yu Wei, Christoph Dann, Julian Zimmert | - | 129 45 0 | 31 Dec 2024
Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs | Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang | - | 170 53 0 | 14 Jun 2020
Model Selection in Contextual Stochastic Bandit Problems | Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup B. Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvári | - | 166 94 0 | 03 Mar 2020
Model selection for contextual bandits | Dylan J. Foster, A. Krishnamurthy, Haipeng Luo | OffRL | 164 90 0 | 03 Jun 2019
OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits | Niladri S. Chatterji, Vidya Muthukumar, Peter L. Bartlett | - | 51 44 0 | 24 May 2019
Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously | Julian Zimmert, Haipeng Luo, Chen-Yu Wei | - | 193 81 0 | 25 Jan 2019
Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits | Julian Zimmert, Yevgeny Seldin | AAML | 161 179 0 | 19 Jul 2018
KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints | Aurélien Garivier, Hédi Hadiji, Pierre Ménard, Gilles Stoltz | - | 43 32 0 | 14 May 2018
More Adaptive Algorithms for Adversarial Bandits | Chen-Yu Wei, Haipeng Luo | - | 126 182 0 | 10 Jan 2018
An Improved Parametrization and Analysis of the EXP3++ Algorithm for Stochastic and Adversarial Bandits | Yevgeny Seldin, Gábor Lugosi | - | 69 92 0 | 20 Feb 2017
Corralling a Band of Bandit Algorithms | Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, Robert Schapire | - | 141 157 0 | 19 Dec 2016
Kernel-based methods for bandit convex optimization | Sébastien Bubeck, Ronen Eldan, Y. Lee | - | 437 166 0 | 11 Jul 2016
An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits | P. Auer, Chao-Kai Chiang | - | 69 111 0 | 27 May 2016
Explore First, Exploit Next: The True Shape of Regret in Bandit Problems | Aurélien Garivier, Pierre Ménard, Gilles Stoltz | - | 46 213 0 | 23 Feb 2016
Deterministic MDPs with Adversarial Rewards and Bandit Feedback | R. Arora, O. Dekel, Ambuj Tewari | - | 77 31 0 | 16 Oct 2012
Bandits with heavy tail | Sébastien Bubeck, Nicolò Cesa-Bianchi, Gábor Lugosi | - | 184 290 0 | 08 Sep 2012
The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond | Aurélien Garivier, Olivier Cappé | - | 166 612 0 | 12 Feb 2011