arXiv: 1307.3400
Thompson Sampling for 1-Dimensional Exponential Family Bandits
12 July 2013
N. Korda
E. Kaufmann
Rémi Munos
Papers citing "Thompson Sampling for 1-Dimensional Exponential Family Bandits" (50 of 85 shown)
- Communication Bounds for the Distributed Experts Problem. Zhihao Jia, Qi Pang, Trung Tran, David Woodruff, Zhihao Zhang, Wenting Zheng (06 Jan 2025)
- On Lai's Upper Confidence Bound in Multi-Armed Bandits. Huachen Ren, Cun-Hui Zhang (03 Oct 2024)
- Multi-Armed Bandits with Abstention. Junwen Yang, Tianyuan Jin, Vincent Y. F. Tan (23 Feb 2024)
- Diffusion Models Meet Contextual Bandits with Large Action Spaces. Imad Aouali (15 Feb 2024) [DiffM]
- Finite-Time Frequentist Regret Bounds of Multi-Agent Thompson Sampling on Sparse Hypergraphs. Tianyuan Jin, Hao-Lun Hsu, William Chang, Pan Xu (24 Dec 2023)
- Approximate information maximization for bandit games. A. Barbier-Chebbah, Christian L. Vestergaard, Jean-Baptiste Masson, Etienne Boursier (19 Oct 2023)
- From Bandits Model to Deep Deterministic Policy Gradient, Reinforcement Learning with Contextual Information. Zhendong Shi, Xiaoli Wei, E. Kuruoglu (01 Oct 2023) [OffRL]
- Thompson Exploration with Best Challenger Rule in Best Arm Identification. Jongyeong Lee, Junya Honda, Masashi Sugiyama (01 Oct 2023)
- Monte-Carlo tree search with uncertainty propagation via optimal transport. Tuan Dam, Pascal Stenger, Lukas Schneider, Joni Pajarinen, Carlo D'Eramo, Odalric-Ambrym Maillard (19 Sep 2023)
- Generalized Regret Analysis of Thompson Sampling using Fractional Posteriors. Prateek Jaiswal, D. Pati, A. Bhattacharya, Bani Mallick (12 Sep 2023)
- Optimal Best-Arm Identification in Bandits with Access to Offline Data. Shubhada Agrawal, Sandeep Juneja, Karthikeyan Shanmugam, A. Suggala (15 Jun 2023)
- Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards. Hao Qin, Kwang-Sung Jun, Chicheng Zhang (28 Apr 2023)
- Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms. Denis Belomestny, Pierre Menard, A. Naumov, D. Tiapkin, Michal Valko (06 Apr 2023)
- A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms. Dorian Baudry, Kazuya Suzuki, Junya Honda (10 Mar 2023)
- The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models. Jongyeong Lee, Chao-Kai Chiang, Masashi Sugiyama (28 Feb 2023)
- Optimality of Thompson Sampling with Noninformative Priors for Pareto Bandits. Jongyeong Lee, Junya Honda, Chao-Kai Chiang, Masashi Sugiyama (03 Feb 2023)
- The Typical Behavior of Bandit Algorithms. Lin Fan, Peter Glynn (11 Oct 2022)
- Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration and Planning. Reda Ouhamma, D. Basu, Odalric-Ambrym Maillard (05 Oct 2022) [OffRL]
- Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits. Tianyuan Jin, Pan Xu, X. Xiao, Anima Anandkumar (07 Jun 2022)
- Adjusted Expected Improvement for Cumulative Regret Minimization in Noisy Bayesian Optimization. Shouri Hu, Haowei Wang, Zhongxiang Dai, K. H. Low, Szu Hui Ng (10 May 2022)
- Multi-armed bandits for resource efficient, online optimization of language model pre-training: the use case of dynamic masking. Iñigo Urteaga, Moulay Draidia, Tomer Lancewicki, Shahram Khadivi (24 Mar 2022) [AI4CE]
- Thompson Sampling on Asymmetric α-Stable Bandits. Zhendong Shi, E. Kuruoglu, Xiaoli Wei (19 Mar 2022)
- Bregman Deviations of Generic Exponential Families. Sayak Ray Chowdhury, Patrick Saux, Odalric-Ambrym Maillard, Aditya Gopalan (18 Jan 2022)
- From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits. Dorian Baudry, Patrick Saux, Odalric-Ambrym Maillard (18 Nov 2021)
- Maillard Sampling: Boltzmann Exploration Done Optimally. Jieming Bian, Kwang-Sung Jun (05 Nov 2021)
- Batched Thompson Sampling. Cem Kalkanli, Ayfer Özgür (01 Oct 2021) [OffRL]
- Asymptotic Performance of Thompson Sampling in the Batched Multi-Armed Bandits. Cem Kalkanli, Ayfer Özgür (01 Oct 2021)
- The Fragility of Optimized Bandit Algorithms. Lin Fan, Peter Glynn (28 Sep 2021)
- Extreme Bandits using Robust Statistics. Sujay Bhatt, Ping Li, G. Samorodnitsky (09 Sep 2021)
- Asymptotically Optimal Bandits under Weighted Information. Matias I. Müller, C. Rojas (28 May 2021)
- No-Regret Reinforcement Learning with Heavy-Tailed Rewards. Vincent Zhuang, Yanan Sui (25 Feb 2021)
- Optimal Thompson Sampling strategies for support-aware CVaR bandits. Dorian Baudry, Romain Gautron, E. Kaufmann, Odalric-Ambrym Maillard (10 Dec 2020)
- Sub-sampling for Efficient Non-Parametric Bandit Exploration. Dorian Baudry, E. Kaufmann, Odalric-Ambrym Maillard (27 Oct 2020)
- Stochastic Bandits with Vector Losses: Minimizing ℓ∞-Norm of Relative Losses. Xuedong Shang, Han Shao, Jian Qian (15 Oct 2020)
- Cooperative Multi-Agent Bandits with Heavy Tails. Abhimanyu Dubey, Alex Pentland (14 Aug 2020)
- Lenient Regret for Multi-Armed Bandits. Nadav Merlis, Shie Mannor (10 Aug 2020)
- Analysis and Design of Thompson Sampling for Stochastic Partial Monitoring. Taira Tsuchiya, Junya Honda, Masashi Sugiyama (17 Jun 2020)
- Scalable Thompson Sampling using Sparse Gaussian Process Models. Sattar Vakili, Henry B. Moss, A. Artemev, Vincent Dutordoir, Victor Picheny (09 Jun 2020)
- Non-Stationary Delayed Bandits with Intermediate Observations. Claire Vernade, András Gyorgy, Timothy A. Mann (03 Jun 2020) [OffRL]
- MOTS: Minimax Optimal Thompson Sampling. Tianyuan Jin, Pan Xu, Jieming Shi, Xiaokui Xiao, Quanquan Gu (03 Mar 2020)
- On Thompson Sampling with Langevin Algorithms. Eric Mazumdar, Aldo Pacchiano, Yi Ma, Peter L. Bartlett, Michael I. Jordan (23 Feb 2020)
- Double Explore-then-Commit: Asymptotic Optimality and Beyond. Tianyuan Jin, Pan Xu, Xiaokui Xiao, Quanquan Gu (21 Feb 2020)
- Bayesian Meta-Prior Learning Using Empirical Bayes. Sareh Nabi, Houssam Nassif, Joseph Hong, H. Mamani, Guido Imbens (04 Feb 2020)
- Better Boosting with Bandits for Online Learning. N. Nikolaou, J. Mellor, N. Oza, Gavin Brown (16 Jan 2020)
- On Thompson Sampling for Smoother-than-Lipschitz Bandits. James A. Grant, David S. Leslie (08 Jan 2020)
- Automatic Ensemble Learning for Online Influence Maximization. Xiaojin Zhang (25 Nov 2019)
- Optimal UCB Adjustments for Large Arm Sizes. H. Chan, Shouri Hu (05 Sep 2019)
- Thompson Sampling on Symmetric α-Stable Bandits. Abhimanyu Dubey, Alex Pentland (08 Jul 2019)
- Exploration Through Reward Biasing: Reward-Biased Maximum Likelihood Estimation for Stochastic Multi-Armed Bandits. Xi Liu, Ping-Chun Hsieh, A. Bhattacharya, P. R. Kumar (02 Jul 2019)
- Bootstrapping Upper Confidence Bound. Botao Hao, Yasin Abbasi-Yadkori, Zheng Wen, Guang Cheng (12 Jun 2019)