1505.00369
Cited By
v1
v2
v3 (latest)
Batched bandit problems
2 May 2015
Vianney Perchet
Philippe Rigollet
Sylvain Chassang
E. Snowberg
OffRL
Papers citing
"Batched bandit problems"
50 / 58 papers shown
Title
Optimization of Epsilon-Greedy Exploration
Ethan Che
Hakan Ceylan
James McInerney
Nathan Kallus
44
0
0
03 Jun 2025
Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids
Tianyuan Jin
Qin Zhang
Dongruo Zhou
122
0
0
29 Jan 2025
Learning to Mitigate Externalities: the Coase Theorem with Hindsight Rationality
Antoine Scheid
Aymeric Capitaine
Etienne Boursier
Eric Moulines
Michael I. Jordan
Alain Durmus
201
5
0
28 Jun 2024
Batched Stochastic Bandit for Nondegenerate Functions
Yu Liu
Yunlu Shu
Tianyu Wang
192
0
0
09 May 2024
Generalized Linear Bandits with Limited Adaptivity
Ayush Sawarni
Nirjhar Das
Siddharth Barman
Gaurav Sinha
191
5
0
10 Apr 2024
Batched Nonparametric Contextual Bandits
Rong Jiang
Cong Ma
OffRL
117
1
0
27 Feb 2024
Replicability is Asymptotically Free in Multi-armed Bandits
Junpei Komiyama
Shinji Ito
Yuichi Yoshida
Souta Koshino
174
1
0
12 Feb 2024
Optimal Batched Best Arm Identification
Tianyuan Jin
Yu Yang
Jing Tang
Xiaokui Xiao
Pan Xu
116
3
0
21 Oct 2023
On Collaboration in Distributed Parameter Estimation with Resource Constraints
Y. Chen
Daniel S. Menasché
Don Towsley
72
0
0
12 Jul 2023
From Random Search to Bandit Learning in Metric Measure Spaces
Chuying Han
Yasong Feng
Tianyu Wang
66
2
0
19 May 2023
Balancing Risk and Reward: An Automated Phased Release Strategy
Yufan Li
Jialiang Mao
Iavor Bojinov
51
0
0
16 May 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
141
8
0
03 Feb 2023
Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits
Nikolai Karpov
Qin Zhang
84
2
0
26 Jan 2023
On Penalization in Stochastic Multi-armed Bandits
Guanhua Fang
P. Li
G. Samorodnitsky
FaML
50
1
0
15 Nov 2022
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Zihan Zhang
Yuhang Jiang
Yuanshuo Zhou
Xiangyang Ji
OffRL
63
10
0
15 Oct 2022
Reward Imputation with Sketching for Contextual Batched Bandits
Xiao Zhang
Ninglu Shao
Zihua Si
Jun Xu
Wen Wang
Hanjing Su
Jirong Wen
OffRL
56
3
0
13 Oct 2022
Near-Optimal Deployment Efficiency in Reward-Free Reinforcement Learning with Linear Function Approximation
Dan Qiao
Yu Wang
OffRL
129
13
0
03 Oct 2022
An Asymptotically Optimal Batched Algorithm for the Dueling Bandit Problem
Arpit Agarwal
R. Ghuge
V. Nagarajan
71
2
0
25 Sep 2022
Differentially Private Stochastic Linear Bandits: (Almost) for Free
Osama A. Hanna
Antonious M. Girgis
Christina Fragouli
Suhas Diggavi
FedML
80
18
0
07 Jul 2022
Synthetically Controlled Bandits
Vivek Farias
C. Moallemi
Tianyi Peng
Andrew Zheng
95
13
0
14 Feb 2022
Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost
Dan Qiao
Ming Yin
Ming Min
Yu Wang
91
29
0
13 Feb 2022
Stochastic differential equations for limiting description of UCB rule for Gaussian multi-armed bandits
S. Garbar
53
0
0
13 Dec 2021
Solving Multi-Arm Bandit Using a Few Bits of Communication
Osama A. Hanna
Lin F. Yang
Christina Fragouli
80
16
0
11 Nov 2021
Online Learning of Energy Consumption for Navigation of Electric Vehicles
Niklas Åkerblom
Yuxin Chen
M. Chehreghani
41
12
0
03 Nov 2021
The Impact of Batch Learning in Stochastic Bandits
Danil Provodin
Pratik Gajane
Mykola Pechenizkiy
M. Kaptein
OffRL
56
2
0
03 Nov 2021
Federated Linear Contextual Bandits
Ruiquan Huang
Weiqiang Wu
Jing Yang
Cong Shen
FedML
82
78
0
27 Oct 2021
Lipschitz Bandits with Batched Feedback
Yasong Feng
Zengfeng Huang
Tianyu Wang
88
14
0
19 Oct 2021
Asymptotic Performance of Thompson Sampling in the Batched Multi-Armed Bandits
Cem Kalkanli
Ayfer Özgür
18
6
0
01 Oct 2021
Dynamic Selection in Algorithmic Decision-making
Jin Li
Ye Luo
Xiaowei Zhang
98
2
0
28 Aug 2021
Batched Thompson Sampling for Multi-Armed Bandits
Nikolai Karpov
Qin Zhang
46
4
0
15 Aug 2021
Debiasing Samples from Online Learning Using Bootstrap
Ningyuan Chen
Xuefeng Gao
Yi Xiong
OffRL
OnRL
52
4
0
31 Jul 2021
Smooth Sequential Optimisation with Delayed Feedback
S. Chennu
Jamie Martin
P. Liyanagama
Phil Mohr
32
2
0
21 Jun 2021
Parallelizing Thompson Sampling
Amin Karbasi
Vahab Mirrokni
M. Shadravan
111
25
0
02 Jun 2021
Batched Neural Bandits
Quanquan Gu
Amin Karbasi
Khashayar Khosravi
Vahab Mirrokni
Dongruo Zhou
BDL
OffRL
62
25
0
25 Feb 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints
Chi Jin
Zhuoran Yang
Zhaoran Wang
OffRL
266
169
0
06 Jan 2021
Restless-UCB, an Efficient and Low-complexity Algorithm for Online Restless Bandits
Siwei Wang
Longbo Huang
John C. S. Lui
OffRL
91
39
0
05 Nov 2020
A Practical Guide of Off-Policy Evaluation for Bandit Problems
Masahiro Kato
Kenshi Abe
Kaito Ariu
Shota Yasui
OffRL
60
3
0
23 Oct 2020
Dynamic Batch Learning in High-Dimensional Sparse Linear Contextual Bandits
Zhimei Ren
Zhengyuan Zhou
122
31
0
27 Aug 2020
Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design
Yufei Ruan
Jiaqi Yang
Yuanshuo Zhou
OffRL
176
52
0
04 Jul 2020
Greedy Algorithm almost Dominates in Smoothed Contextual Bandits
Manish Raghavan
Aleksandrs Slivkins
Jennifer Wortman Vaughan
Zhiwei Steven Wu
388
18
0
19 May 2020
Collaborative Top Distribution Identifications with Limited Interaction
Nikolai Karpov
Qin Zhang
Yuanshuo Zhou
74
27
0
20 Apr 2020
Sequential Batch Learning in Finite-Action Linear Contextual Bandits
Yanjun Han
Zhengqing Zhou
Zhengyuan Zhou
Jose H. Blanchet
Peter Glynn
Yinyu Ye
OffRL
100
71
0
14 Apr 2020
Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability
D. Simchi-Levi
Yunzong Xu
OffRL
441
112
0
28 Mar 2020
Inference for Batched Bandits
Kelly W. Zhang
Lucas Janson
Susan Murphy
105
85
0
08 Feb 2020
Blind Network Revenue Management and Bandits with Knapsacks under Limited Switches
D. Simchi-Levi
Yunzong Xu
Jinglong Zhao
40
2
0
04 Nov 2019
Online Debiasing for Adaptively Collected High-dimensional Data with Applications to Time Series Analysis
Y. Deshpande
Adel Javanmard
M. Mehrabi
AI4TS
132
32
0
04 Nov 2019
Regret Bounds for Batched Bandits
Hossein Esfandiari
Amin Karbasi
Abbas Mehrabian
Vahab Mirrokni
92
63
0
11 Oct 2019
Provably Efficient Q-Learning with Low Switching Cost
Yu Bai
Tengyang Xie
Nan Jiang
Yu Wang
93
94
0
30 May 2019
Phase Transitions in Bandits with Switching Constraints
D. Simchi-Levi
Yunzong Xu
91
8
0
26 May 2019
Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits
Chao Tao
Qin Zhang
Yuanshuo Zhou
FedML
59
61
0
05 Apr 2019