ResearchTrend.AI
Kullback-Leibler upper confidence bounds for optimal sequential allocation

3 October 2012
Olivier Cappé
Aurélien Garivier
Odalric-Ambrym Maillard
Rémi Munos
Gilles Stoltz

Papers citing "Kullback-Leibler upper confidence bounds for optimal sequential allocation"

13 / 13 papers shown
Best-Arm Identification in Unimodal Bandits
Riccardo Poiani
Marc Jourdan
E. Kaufmann
Rémy Degenne
93
0
0
04 Nov 2024
On Speeding Up Language Model Evaluation
Jin Peng Zhou
Christian K. Belardi
Ruihan Wu
Travis Zhang
Carla P. Gomes
Wen Sun
Kilian Q. Weinberger
76
1
0
08 Jul 2024
Optimal Multi-Fidelity Best-Arm Identification
Riccardo Poiani
Rémy Degenne
Emilie Kaufmann
Alberto Maria Metelli
Marcello Restelli
73
1
0
05 Jun 2024
Replicability is Asymptotically Free in Multi-armed Bandits
Junpei Komiyama
Shinji Ito
Yuichi Yoshida
Souta Koshino
87
1
0
12 Feb 2024
Truncated LinUCB for Stochastic Linear Bandits
Yanglei Song
Meng Zhou
126
0
0
23 Feb 2022
Maximin Action Identification: A New Bandit Framework for Games
Aurélien Garivier
E. Kaufmann
Wouter M. Koolen
34
28
0
15 Feb 2016
Infomax strategies for an optimal balance between exploration and exploitation
Gautam Reddy
A. Celani
M. Vergassola
20
17
0
12 Jan 2016
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
E. Kaufmann
N. Korda
Rémi Munos
81
585
0
18 May 2012
Finite-time Regret Bound of a Bandit Algorithm for the Semi-bounded Support Model
Junya Honda
Akimichi Takemura
56
7
0
10 Feb 2012
A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences
Odalric-Ambrym Maillard
Rémi Munos
Gilles Stoltz
58
146
0
29 May 2011
The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond
Aurélien Garivier
Olivier Cappé
92
613
0
12 Feb 2011
Optimism in Reinforcement Learning and Kullback-Leibler Divergence
Sarah Filippi
Olivier Cappé
Aurélien Garivier
91
105
0
29 Apr 2010
An Asymptotically Optimal Policy for Finite Support Models in the Multiarmed Bandit Problem
Junya Honda
Akimichi Takemura
65
121
0
17 May 2009