
Batched bandit problems

2 May 2015

Papers citing "Batched bandit problems"

50 / 58 papers shown

Title
Optimization of Epsilon-Greedy Exploration Ethan Che Hakan Ceylan James McInerney Nathan Kallus 46 0 0 03 Jun 2025
Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids Tianyuan Jin Qin Zhang Dongruo Zhou 122 0 0 29 Jan 2025
Learning to Mitigate Externalities: the Coase Theorem with Hindsight Rationality Antoine Scheid Aymeric Capitaine Etienne Boursier Eric Moulines Michael I. Jordan Alain Durmus 201 5 0 28 Jun 2024
Batched Stochastic Bandit for Nondegenerate Functions Yu Liu Yunlu Shu Tianyu Wang 192 0 0 09 May 2024
Generalized Linear Bandits with Limited Adaptivity Ayush Sawarni Nirjhar Das Siddharth Barman Gaurav Sinha 191 5 0 10 Apr 2024
Batched Nonparametric Contextual Bandits Rong Jiang Cong Ma OffRL 119 1 0 27 Feb 2024
Replicability is Asymptotically Free in Multi-armed Bandits Junpei Komiyama Shinji Ito Yuichi Yoshida Souta Koshino 174 1 0 12 Feb 2024
Optimal Batched Best Arm Identification Tianyuan Jin Yu Yang Jing Tang Xiaokui Xiao Pan Xu 116 3 0 21 Oct 2023
On Collaboration in Distributed Parameter Estimation with Resource Constraints Y. Chen Daniel S. Menasché Don Towsley 72 0 0 12 Jul 2023
From Random Search to Bandit Learning in Metric Measure Spaces Chuying Han Yasong Feng Tianyu Wang 66 2 0 19 May 2023
Balancing Risk and Reward: An Automated Phased Release Strategy Yufan Li Jialiang Mao Iavor Bojinov 51 0 0 16 May 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback Yunchang Yang Hangshi Zhong Tianhao Wu B. Liu Liwei Wang S. Du OffRL 141 8 0 03 Feb 2023
Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits Nikolai Karpov Qin Zhang 84 2 0 26 Jan 2023
On Penalization in Stochastic Multi-armed Bandits Guanhua Fang P. Li G. Samorodnitsky FaML 50 1 0 15 Nov 2022
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning Zihan Zhang Yuhang Jiang Yuanshuo Zhou Xiangyang Ji OffRL 63 10 0 15 Oct 2022
Reward Imputation with Sketching for Contextual Batched Bandits Xiao Zhang Ninglu Shao Zihua Si Jun Xu Wen Wang Hanjing Su Jirong Wen OffRL 56 3 0 13 Oct 2022
Near-Optimal Deployment Efficiency in Reward-Free Reinforcement Learning with Linear Function Approximation Dan Qiao Yu Wang OffRL 129 13 0 03 Oct 2022
An Asymptotically Optimal Batched Algorithm for the Dueling Bandit Problem Arpit Agarwal R. Ghuge V. Nagarajan 71 2 0 25 Sep 2022
Differentially Private Stochastic Linear Bandits: (Almost) for Free Osama A. Hanna Antonious M. Girgis Christina Fragouli Suhas Diggavi FedML 80 18 0 07 Jul 2022
Synthetically Controlled Bandits Vivek Farias C. Moallemi Tianyi Peng Andrew Zheng 95 13 0 14 Feb 2022
Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost Dan Qiao Ming Yin Ming Min Yu Wang 91 29 0 13 Feb 2022
Stochastic differential equations for limiting description of UCB rule for Gaussian multi-armed bandits S. Garbar 53 0 0 13 Dec 2021
Solving Multi-Arm Bandit Using a Few Bits of Communication Osama A. Hanna Lin F. Yang Christina Fragouli 80 16 0 11 Nov 2021
Online Learning of Energy Consumption for Navigation of Electric Vehicles Niklas Åkerblom Yuxin Chen M. Chehreghani 41 12 0 03 Nov 2021
The Impact of Batch Learning in Stochastic Bandits Danil Provodin Pratik Gajane Mykola Pechenizkiy M. Kaptein OffRL 56 2 0 03 Nov 2021
Federated Linear Contextual Bandits Ruiquan Huang Weiqiang Wu Jing Yang Cong Shen FedML 82 78 0 27 Oct 2021
Lipschitz Bandits with Batched Feedback Yasong Feng Zengfeng Huang Tianyu Wang 88 14 0 19 Oct 2021
Asymptotic Performance of Thompson Sampling in the Batched Multi-Armed Bandits Cem Kalkanli Ayfer Özgür 18 6 0 01 Oct 2021
Dynamic Selection in Algorithmic Decision-making Jin Li Ye Luo Xiaowei Zhang 98 2 0 28 Aug 2021
Batched Thompson Sampling for Multi-Armed Bandits Nikolai Karpov Qin Zhang 46 4 0 15 Aug 2021
Debiasing Samples from Online Learning Using Bootstrap Ningyuan Chen Xuefeng Gao Yi Xiong OffRL OnRL 52 4 0 31 Jul 2021
Smooth Sequential Optimisation with Delayed Feedback S. Chennu Jamie Martin P. Liyanagama Phil Mohr 32 2 0 21 Jun 2021
Parallelizing Thompson Sampling Amin Karbasi Vahab Mirrokni M. Shadravan 111 25 0 02 Jun 2021
Batched Neural Bandits Quanquan Gu Amin Karbasi Khashayar Khosravi Vahab Mirrokni Dongruo Zhou BDL OffRL 62 25 0 25 Feb 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints Chi Jin Zhuoran Yang Zhaoran Wang OffRL 266 169 0 06 Jan 2021
Restless-UCB, an Efficient and Low-complexity Algorithm for Online Restless Bandits Siwei Wang Longbo Huang John C. S. Lui OffRL 91 39 0 05 Nov 2020
A Practical Guide of Off-Policy Evaluation for Bandit Problems Masahiro Kato Kenshi Abe Kaito Ariu Shota Yasui OffRL 60 3 0 23 Oct 2020
Dynamic Batch Learning in High-Dimensional Sparse Linear Contextual Bandits Zhimei Ren Zhengyuan Zhou 122 31 0 27 Aug 2020
Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design Yufei Ruan Jiaqi Yang Yuanshuo Zhou OffRL 176 52 0 04 Jul 2020
Greedy Algorithm almost Dominates in Smoothed Contextual Bandits Manish Raghavan Aleksandrs Slivkins Jennifer Wortman Vaughan Zhiwei Steven Wu 388 18 0 19 May 2020
Collaborative Top Distribution Identifications with Limited Interaction Nikolai Karpov Qin Zhang Yuanshuo Zhou 74 27 0 20 Apr 2020
Sequential Batch Learning in Finite-Action Linear Contextual Bandits Yanjun Han Zhengqing Zhou Zhengyuan Zhou Jose H. Blanchet Peter Glynn Yinyu Ye OffRL 100 71 0 14 Apr 2020
Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability D. Simchi-Levi Yunzong Xu OffRL 441 112 0 28 Mar 2020
Inference for Batched Bandits Kelly W. Zhang Lucas Janson Susan Murphy 105 85 0 08 Feb 2020
Blind Network Revenue Management and Bandits with Knapsacks under Limited Switches D. Simchi-Levi Yunzong Xu Jinglong Zhao 40 2 0 04 Nov 2019
Online Debiasing for Adaptively Collected High-dimensional Data with Applications to Time Series Analysis Y. Deshpande Adel Javanmard M. Mehrabi AI4TS 132 32 0 04 Nov 2019
Regret Bounds for Batched Bandits Hossein Esfandiari Amin Karbasi Abbas Mehrabian Vahab Mirrokni 92 63 0 11 Oct 2019
Provably Efficient Q-Learning with Low Switching Cost Yu Bai Tengyang Xie Nan Jiang Yu Wang 93 94 0 30 May 2019
Phase Transitions in Bandits with Switching Constraints D. Simchi-Levi Yunzong Xu 91 8 0 26 May 2019
Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits Chao Tao Qin Zhang Yuanshuo Zhou FedML 59 61 0 05 Apr 2019