1210.1136
Kullback-Leibler upper confidence bounds for optimal sequential allocation
3 October 2012
Olivier Cappé
Aurélien Garivier
Odalric-Ambrym Maillard
Rémi Munos
Gilles Stoltz
Papers citing
"Kullback-Leibler upper confidence bounds for optimal sequential allocation"
13 / 13 papers shown
Best-Arm Identification in Unimodal Bandits
Riccardo Poiani
Marc Jourdan
E. Kaufmann
Rémy Degenne
93
0
0
04 Nov 2024
On Speeding Up Language Model Evaluation
Jin Peng Zhou
Christian K. Belardi
Ruihan Wu
Travis Zhang
Carla P. Gomes
Wen Sun
Kilian Q. Weinberger
76
1
0
08 Jul 2024
Optimal Multi-Fidelity Best-Arm Identification
Riccardo Poiani
Rémy Degenne
Emilie Kaufmann
Alberto Maria Metelli
Marcello Restelli
73
1
0
05 Jun 2024
Replicability is Asymptotically Free in Multi-armed Bandits
Junpei Komiyama
Shinji Ito
Yuichi Yoshida
Souta Koshino
87
1
0
12 Feb 2024
Truncated LinUCB for Stochastic Linear Bandits
Yanglei Song
Meng Zhou
126
0
0
23 Feb 2022
Maximin Action Identification: A New Bandit Framework for Games
Aurélien Garivier
E. Kaufmann
Wouter M. Koolen
34
28
0
15 Feb 2016
Infomax strategies for an optimal balance between exploration and exploitation
Gautam Reddy
A. Celani
M. Vergassola
20
17
0
12 Jan 2016
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
E. Kaufmann
N. Korda
Rémi Munos
81
585
0
18 May 2012
Finite-time Regret Bound of a Bandit Algorithm for the Semi-bounded Support Model
Junya Honda
Akimichi Takemura
56
7
0
10 Feb 2012
A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences
Odalric-Ambrym Maillard
Rémi Munos
Gilles Stoltz
58
146
0
29 May 2011
The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond
Aurélien Garivier
Olivier Cappé
92
613
0
12 Feb 2011
Optimism in Reinforcement Learning and Kullback-Leibler Divergence
Sarah Filippi
Olivier Cappé
Aurélien Garivier
91
105
0
29 Apr 2010
An Asymptotically Optimal Policy for Finite Support Models in the Multiarmed Bandit Problem
Junya Honda
Akimichi Takemura
65
121
0
17 May 2009