Inference for Batched Bandits


8 February 2020
Kelly W. Zhang
Lucas Janson
S. Murphy

Papers citing "Inference for Batched Bandits"

49 / 49 papers shown
Title
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
69
1
0
22 Feb 2025
A Near-optimal, Scalable and Corruption-tolerant Framework for Stochastic Bandits: From Single-Agent to Multi-Agent and Beyond
Zicheng Hu
Cheng Chen
72
0
0
11 Feb 2025
Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect
Ojash Neopane
Aaditya Ramdas
Aarti Singh
CML
64
0
0
21 Nov 2024
Off-policy estimation with adaptively collected data: the power of online learning
Jeonghwan Lee
Cong Ma
OffRL
81
0
0
19 Nov 2024
Linear Contextual Bandits with Interference
Yang Xu
Wenbin Lu
Rui Song
27
0
0
24 Sep 2024
MiWaves Reinforcement Learning Algorithm
Susobhan Ghosh
Yongyi Guo
Pei-Yao Hung
Lara N. Coughlin
Erin Bonar
Inbal Nahum-Shani
Maureen Walton
S. Murphy
13
1
0
27 Aug 2024
AExGym: Benchmarks and Environments for Adaptive Experimentation
Jimmy Wang
Ethan Che
Daniel R. Jiang
Hongseok Namkoong
44
0
0
08 Aug 2024
Oralytics Reinforcement Learning Algorithm
Anna L. Trella
Kelly W. Zhang
Stephanie M Carpenter
David Elashoff
Zara M Greer
Inbal Nahum-Shani
Dennis Ruenger
Vivek Shetty
S. Murphy
20
0
0
19 Jun 2024
Demistifying Inference after Adaptive Experiments
Aurélien F. Bibaut
Nathan Kallus
30
1
0
02 May 2024
Active Adaptive Experimental Design for Treatment Effect Estimation with Covariate Choices
Masahiro Kato
Akihiro Oga
Wataru Komatsubara
Ryo Inokuchi
49
0
0
06 Mar 2024
Batched Nonparametric Contextual Bandits
Rong Jiang
Cong Ma
OffRL
39
1
0
27 Feb 2024
Best of Three Worlds: Adaptive Experimentation for Digital Marketing in Practice
Tanner Fiez
Houssam Nassif
Yu-Cheng Chen
Sergio Gamez
Lalit P. Jain
16
5
0
16 Feb 2024
An Experimental Design for Anytime-Valid Causal Inference on Multi-Armed Bandits
Biyonka Liang
Iavor Bojinov
45
5
0
09 Nov 2023
Statistical Limits of Adaptive Linear Models: Low-Dimensional Estimation and Inference
Licong Lin
Mufang Ying
Suvrojit Ghosh
K. Khamaru
Cun-Hui Zhang
17
2
0
01 Oct 2023
Optimal Conditional Inference in Adaptive Experiments
Jiafeng Chen
Isaiah Andrews
23
3
0
21 Sep 2023
Adaptive Linear Estimating Equations
Mufang Ying
K. Khamaru
Cun-Hui Zhang
25
4
0
14 Jul 2023
Statistical Inference on Multi-armed Bandits with Delayed Feedback
Lei Shi
Jingshen Wang
Tianhao Wu
30
4
0
03 Jul 2023
Optimal tests following sequential experiments
Karun Adusumilli
31
2
0
30 Apr 2023
Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling
Susobhan Ghosh
Raphael Kim
Prasidh Chhabria
Raaz Dwivedi
Predrag Klasjna
Peng Liao
Kelly Zhang
Susan Murphy
OffRL
24
8
0
11 Apr 2023
Adaptive Experimentation at Scale: A Computational Framework for Flexible Batches
Ethan Che
Hongseok Namkoong
OffRL
51
1
0
21 Mar 2023
Semi-parametric inference based on adaptively collected data
Licong Lin
K. Khamaru
Martin J. Wainwright
OffRL
39
6
0
05 Mar 2023
Design-Based Inference for Multi-arm Bandits
D. Ham
Iavor Bojinov
Michael Lindon
M. Tingley
34
1
0
27 Feb 2023
A Lipschitz Bandits Approach for Continuous Hyperparameter Optimization
Yasong Feng
Weijian Luo
Yimin Huang
Tianyu Wang
26
8
0
03 Feb 2023
Anytime-valid off-policy inference for contextual bandits
Ian Waudby-Smith
Lili Wu
Aaditya Ramdas
Nikos Karampatziakis
Paul Mineiro
OffRL
43
25
0
19 Oct 2022
Reward Imputation with Sketching for Contextual Batched Bandits
Xiao Zhang
Ninglu Shao
Zihua Si
Jun Xu
Wen Wang
Hanjing Su
Jirong Wen
OffRL
25
1
0
13 Oct 2022
Entropy Regularization for Population Estimation
Ben Chugg
Peter Henderson
Jacob Goldin
Daniel E. Ho
28
3
0
24 Aug 2022
Reward Design For An Online Reinforcement Learning Algorithm Supporting Oral Self-Care
Anna L. Trella
Kelly W. Zhang
Inbal Nahum-Shani
Vivek Shetty
Finale Doshi-Velez
S. Murphy
OnRL
21
19
0
15 Aug 2022
Some performance considerations when using multi-armed bandit algorithms in the presence of missing data
Xijin Chen
K. M. Lee
S. Villar
D. Robertson
47
1
0
08 May 2022
Integrating Reward Maximization and Population Estimation: Sequential Decision-Making for Internal Revenue Service Audit Selection
Peter Henderson
Ben Chugg
Brandon R. Anderson
Kristen M. Altenburger
Alex Turk
J. Guyton
Jacob Goldin
Daniel E. Ho
OffRL
22
9
0
25 Apr 2022
Algorithms for Adaptive Experiments that Trade-off Statistical Analysis with Reward: Combining Uniform Random Assignment and Reward Maximization
Tong Li
Jacob Nogas
Haochen Song
Harsh Kumar
A. Durand
Anna N. Rafferty
Nina Deliu
S. Villar
Joseph Jay Williams
29
5
0
15 Dec 2021
Safe Data Collection for Offline and Online Policy Learning
Ruihao Zhu
B. Kveton
OffRL
13
5
0
08 Nov 2021
Efficient Inference Without Trading-off Regret in Bandits: An Allocation Probability Test for Thompson Sampling
Nina Deliu
Joseph Jay Williams
S. Villar
4
10
0
30 Oct 2021
Doubly Robust Interval Estimation for Optimal Policy Evaluation in Online Learning
Ye Shen
Hengrui Cai
Rui Song
OffRL
37
2
0
29 Oct 2021
Lipschitz Bandits with Batched Feedback
Yasong Feng
Zengfeng Huang
Tianyu Wang
11
14
0
19 Oct 2021
Efficient Online Estimation of Causal Effects by Deciding What to Observe
Shantanu Gupta
Zachary Chase Lipton
David Benjamin Childers
CML
32
18
0
20 Aug 2021
Near-optimal inference in adaptive linear regression
K. Khamaru
Y. Deshpande
Tor Lattimore
Lester W. Mackey
Martin J. Wainwright
27
16
0
05 Jul 2021
A Closer Look at the Worst-case Behavior of Multi-armed Bandit Algorithms
Anand Kalvit
A. Zeevi
25
32
0
03 Jun 2021
Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits
Ruohan Zhan
Vitor Hadad
David A. Hirshberg
Susan Athey
OffRL
9
60
0
03 Jun 2021
From Finite to Countable-Armed Bandits
Anand Kalvit
A. Zeevi
22
14
0
22 May 2021
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
25
36
0
10 May 2021
Statistical Inference with M-Estimators on Adaptively Collected Data
Kelly W. Zhang
Lucas Janson
S. Murphy
OffRL
19
40
0
29 Apr 2021
Challenges in Statistical Analysis of Data Collected by a Bandit Algorithm: An Empirical Exploration in Applications to Adaptively Randomized Experiments
Joseph Jay Williams
Jacob Nogas
Nina Deliu
Hammad Shaikh
S. Villar
A. Durand
Anna N. Rafferty
AAML
22
10
0
22 Mar 2021
Online Multi-Armed Bandits with Adaptive Inference
Maria Dimakopoulou
Zhimei Ren
Zhengyuan Zhou
29
34
0
25 Feb 2021
Adaptive Doubly Robust Estimator from Non-stationary Logging Policy under a Convergence of Average Probability
Masahiro Kato
OffRL
24
0
0
17 Feb 2021
Weak Signal Asymptotics for Sequentially Randomized Experiments
Xueheng Kuang
Stefan Wager
25
8
0
25 Jan 2021
Off-Policy Evaluation of Bandit Algorithm from Dependent Samples under Batch Update Policy
Masahiro Kato
Yusuke Kaneko
OffRL
17
4
0
23 Oct 2020
Optimal Off-Policy Evaluation from Multiple Logging Policies
Nathan Kallus
Yuta Saito
Masatoshi Uehara
OffRL
6
40
0
21 Oct 2020
Panel Experiments and Dynamic Causal Effects: A Finite Population Perspective
Iavor Bojinov
Ashesh Rambachan
N. Shephard
10
48
0
22 Mar 2020
Online Debiasing for Adaptively Collected High-dimensional Data with Applications to Time Series Analysis
Y. Deshpande
Adel Javanmard
M. Mehrabi
AI4TS
34
31
0
04 Nov 2019