Inference for Batched Bandits

8 February 2020

Kelly W. Zhang

Papers citing "Inference for Batched Bandits"

Statistical Inference in Reinforcement Learning: A Selective Survey Chengchun Shi OffRL 69 1 0 22 Feb 2025
A Near-optimal, Scalable and Corruption-tolerant Framework for Stochastic Bandits: From Single-Agent to Multi-Agent and Beyond Zicheng Hu Cheng Chen 72 0 0 11 Feb 2025
Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect Ojash Neopane Aaditya Ramdas Aarti Singh CML 64 0 0 21 Nov 2024
Off-policy estimation with adaptively collected data: the power of online learning Jeonghwan Lee Cong Ma OffRL 81 0 0 19 Nov 2024
Linear Contextual Bandits with Interference Yang Xu Wenbin Lu Rui Song 27 0 0 24 Sep 2024
MiWaves Reinforcement Learning Algorithm Susobhan Ghosh Yongyi Guo Pei-Yao Hung Lara N. Coughlin Erin Bonar Inbal Nahum-Shani Maureen Walton S. Murphy 13 1 0 27 Aug 2024
AExGym: Benchmarks and Environments for Adaptive Experimentation Jimmy Wang Ethan Che Daniel R. Jiang Hongseok Namkoong 44 0 0 08 Aug 2024
Oralytics Reinforcement Learning Algorithm Anna L. Trella Kelly W. Zhang Stephanie M Carpenter David Elashoff Zara M Greer Inbal Nahum-Shani Dennis Ruenger Vivek Shetty S. Murphy 20 0 0 19 Jun 2024
Demistifying Inference after Adaptive Experiments Aurélien F. Bibaut Nathan Kallus 30 1 0 02 May 2024
Active Adaptive Experimental Design for Treatment Effect Estimation with Covariate Choices Masahiro Kato Akihiro Oga Wataru Komatsubara Ryo Inokuchi 49 0 0 06 Mar 2024
Batched Nonparametric Contextual Bandits Rong Jiang Cong Ma OffRL 39 1 0 27 Feb 2024
Best of Three Worlds: Adaptive Experimentation for Digital Marketing in Practice Tanner Fiez Houssam Nassif Yu-Cheng Chen Sergio Gamez Lalit P. Jain 16 5 0 16 Feb 2024
An Experimental Design for Anytime-Valid Causal Inference on Multi-Armed Bandits Biyonka Liang Iavor Bojinov 45 5 0 09 Nov 2023
Statistical Limits of Adaptive Linear Models: Low-Dimensional Estimation and Inference Licong Lin Mufang Ying Suvrojit Ghosh K. Khamaru Cun-Hui Zhang 17 2 0 01 Oct 2023
Optimal Conditional Inference in Adaptive Experiments Jiafeng Chen Isaiah Andrews 23 3 0 21 Sep 2023
Adaptive Linear Estimating Equations Mufang Ying K. Khamaru Cun-Hui Zhang 25 4 0 14 Jul 2023
Statistical Inference on Multi-armed Bandits with Delayed Feedback Lei Shi Jingshen Wang Tianhao Wu 30 4 0 03 Jul 2023
Optimal tests following sequential experiments Karun Adusumilli 31 2 0 30 Apr 2023
Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling Susobhan Ghosh Raphael Kim Prasidh Chhabria Raaz Dwivedi Predrag Klasjna Peng Liao Kelly Zhang Susan Murphy OffRL 24 8 0 11 Apr 2023
Adaptive Experimentation at Scale: A Computational Framework for Flexible Batches Ethan Che Hongseok Namkoong OffRL 51 1 0 21 Mar 2023
Semi-parametric inference based on adaptively collected data Licong Lin K. Khamaru Martin J. Wainwright OffRL 39 6 0 05 Mar 2023
Design-Based Inference for Multi-arm Bandits D. Ham Iavor Bojinov Michael Lindon M. Tingley 34 1 0 27 Feb 2023
A Lipschitz Bandits Approach for Continuous Hyperparameter Optimization Yasong Feng Weijian Luo Yimin Huang Tianyu Wang 26 8 0 03 Feb 2023
Anytime-valid off-policy inference for contextual bandits Ian Waudby-Smith Lili Wu Aaditya Ramdas Nikos Karampatziakis Paul Mineiro OffRL 43 25 0 19 Oct 2022
Reward Imputation with Sketching for Contextual Batched Bandits Xiao Zhang Ninglu Shao Zihua Si Jun Xu Wen Wang Hanjing Su Jirong Wen OffRL 25 1 0 13 Oct 2022
Entropy Regularization for Population Estimation Ben Chugg Peter Henderson Jacob Goldin Daniel E. Ho 28 3 0 24 Aug 2022
Reward Design For An Online Reinforcement Learning Algorithm Supporting Oral Self-Care Anna L. Trella Kelly W. Zhang Inbal Nahum-Shani Vivek Shetty Finale Doshi-Velez S. Murphy OnRL 21 19 0 15 Aug 2022
Some performance considerations when using multi-armed bandit algorithms in the presence of missing data Xijin Chen K. M. Lee S. Villar D. Robertson 47 1 0 08 May 2022
Integrating Reward Maximization and Population Estimation: Sequential Decision-Making for Internal Revenue Service Audit Selection Peter Henderson Ben Chugg Brandon R. Anderson Kristen M. Altenburger Alex Turk J. Guyton Jacob Goldin Daniel E. Ho OffRL 22 9 0 25 Apr 2022
Algorithms for Adaptive Experiments that Trade-off Statistical Analysis with Reward: Combining Uniform Random Assignment and Reward Maximization Tong Li Jacob Nogas Haochen Song Harsh Kumar A. Durand Anna N. Rafferty Nina Deliu S. Villar Joseph Jay Williams 29 5 0 15 Dec 2021
Safe Data Collection for Offline and Online Policy Learning Ruihao Zhu B. Kveton OffRL 13 5 0 08 Nov 2021
Efficient Inference Without Trading-off Regret in Bandits: An Allocation Probability Test for Thompson Sampling Nina Deliu Joseph Jay Williams S. Villar 4 10 0 30 Oct 2021
Doubly Robust Interval Estimation for Optimal Policy Evaluation in Online Learning Ye Shen Hengrui Cai Rui Song OffRL 37 2 0 29 Oct 2021
Lipschitz Bandits with Batched Feedback Yasong Feng Zengfeng Huang Tianyu Wang 11 14 0 19 Oct 2021
Efficient Online Estimation of Causal Effects by Deciding What to Observe Shantanu Gupta Zachary Chase Lipton David Benjamin Childers CML 32 18 0 20 Aug 2021
Near-optimal inference in adaptive linear regression K. Khamaru Y. Deshpande Tor Lattimore Lester W. Mackey Martin J. Wainwright 27 16 0 05 Jul 2021
A Closer Look at the Worst-case Behavior of Multi-armed Bandit Algorithms Anand Kalvit A. Zeevi 25 32 0 03 Jun 2021
Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits Ruohan Zhan Vitor Hadad David A. Hirshberg Susan Athey OffRL 9 60 0 03 Jun 2021
From Finite to Countable-Armed Bandits Anand Kalvit A. Zeevi 22 14 0 22 May 2021
Deeply-Debiased Off-Policy Interval Estimation C. Shi Runzhe Wan Victor Chernozhukov R. Song OffRL 25 36 0 10 May 2021
Statistical Inference with M-Estimators on Adaptively Collected Data Kelly W. Zhang Lucas Janson S. Murphy OffRL 19 40 0 29 Apr 2021
Challenges in Statistical Analysis of Data Collected by a Bandit Algorithm: An Empirical Exploration in Applications to Adaptively Randomized Experiments Joseph Jay Williams Jacob Nogas Nina Deliu Hammad Shaikh S. Villar A. Durand Anna N. Rafferty AAML 22 10 0 22 Mar 2021
Online Multi-Armed Bandits with Adaptive Inference Maria Dimakopoulou Zhimei Ren Zhengyuan Zhou 29 34 0 25 Feb 2021
Adaptive Doubly Robust Estimator from Non-stationary Logging Policy under a Convergence of Average Probability Masahiro Kato OffRL 24 0 0 17 Feb 2021
Weak Signal Asymptotics for Sequentially Randomized Experiments Xueheng Kuang Stefan Wager 25 8 0 25 Jan 2021
Off-Policy Evaluation of Bandit Algorithm from Dependent Samples under Batch Update Policy Masahiro Kato Yusuke Kaneko OffRL 17 4 0 23 Oct 2020
Optimal Off-Policy Evaluation from Multiple Logging Policies Nathan Kallus Yuta Saito Masatoshi Uehara OffRL 6 40 0 21 Oct 2020
Panel Experiments and Dynamic Causal Effects: A Finite Population Perspective Iavor Bojinov Ashesh Rambachan N. Shephard 10 48 0 22 Mar 2020
Online Debiasing for Adaptively Collected High-dimensional Data with Applications to Time Series Analysis Y. Deshpande Adel Javanmard M. Mehrabi AI4TS 34 31 0 04 Nov 2019