Kullback-Leibler upper confidence bounds for optimal sequential allocation

3 October 2012
Olivier Cappé
Aurélien Garivier
Odalric-Ambrym Maillard
Rémi Munos
Gilles Stoltz
arXiv:1210.1136

Papers citing "Kullback-Leibler upper confidence bounds for optimal sequential allocation"

13 / 13 papers shown
Best-Arm Identification in Unimodal Bandits
Riccardo Poiani
Marc Jourdan
E. Kaufmann
Rémy Degenne
103
0
0
04 Nov 2024
On Speeding Up Language Model Evaluation
Jin Peng Zhou
Christian K. Belardi
Ruihan Wu
Travis Zhang
Carla P. Gomes
Wen Sun
Kilian Q. Weinberger
78
1
0
08 Jul 2024
Optimal Multi-Fidelity Best-Arm Identification
Riccardo Poiani
Rémy Degenne
Emilie Kaufmann
Alberto Maria Metelli
Marcello Restelli
80
1
0
05 Jun 2024
Replicability is Asymptotically Free in Multi-armed Bandits
Junpei Komiyama
Shinji Ito
Yuichi Yoshida
Souta Koshino
91
1
0
12 Feb 2024
Truncated LinUCB for Stochastic Linear Bandits
Yanglei Song
Meng Zhou
140
0
0
23 Feb 2022
Maximin Action Identification: A New Bandit Framework for Games
Aurélien Garivier
E. Kaufmann
Wouter M. Koolen
37
29
0
15 Feb 2016
Infomax strategies for an optimal balance between exploration and exploitation
Gautam Reddy
A. Celani
M. Vergassola
26
17
0
12 Jan 2016
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
E. Kaufmann
N. Korda
Rémi Munos
99
585
0
18 May 2012
Finite-time Regret Bound of a Bandit Algorithm for the Semi-bounded Support Model
Junya Honda
Akimichi Takemura
60
7
0
10 Feb 2012
A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences
Odalric-Ambrym Maillard
Rémi Munos
Gilles Stoltz
65
146
0
29 May 2011
The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond
Aurélien Garivier
Olivier Cappé
99
613
0
12 Feb 2011
Optimism in Reinforcement Learning and Kullback-Leibler Divergence
Sarah Filippi
Olivier Cappé
Aurélien Garivier
96
105
0
29 Apr 2010
An Asymptotically Optimal Policy for Finite Support Models in the Multiarmed Bandit Problem
Junya Honda
Akimichi Takemura
76
121
0
17 May 2009