Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1612.01205
Cited By
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
4 December 2016
Yu Wang
Alekh Agarwal
Miroslav Dudík
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Optimal and Adaptive Off-policy Evaluation in Contextual Bandits"
50 / 51 papers shown
Title
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
Masanori Nojima
OffRL
42
0
0
02 May 2025
Cross-Validated Off-Policy Evaluation
Matej Cief
Branislav Kveton
Michal Kompan
OffRL
33
1
0
24 May 2024
Hyperparameter Optimization Can Even be Harmful in Off-Policy Learning and How to Deal with It
Yuta Saito
Masahiro Nomura
OffRL
55
2
0
23 Apr 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
41
5
0
22 Feb 2024
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction
Haruka Kiyohara
Masahiro Nomura
Yuta Saito
27
6
0
03 Feb 2024
Individualized Policy Evaluation and Learning under Clustered Network Interference
Yi Zhang
Kosuke Imai
OffRL
42
1
0
04 Nov 2023
Distributional Off-Policy Evaluation for Slate Recommendations
Shreyas Chaudhari
David Arbour
Georgios Theocharous
N. Vlassis
OffRL
46
0
0
27 Aug 2023
Online learning in bandits with predicted context
Yongyi Guo
Ziping Xu
Susan Murphy
26
4
0
26 Jul 2023
Balanced Off-Policy Evaluation for Personalized Pricing
Adam N. Elmachtoub
Vishal Gupta
Yunfan Zhao
OffRL
42
6
0
24 Feb 2023
SPEED: Experimental Design for Policy Evaluation in Linear Heteroscedastic Bandits
Subhojyoti Mukherjee
Qiaomin Xie
Josiah P. Hanna
R. Nowak
OffRL
58
5
0
29 Jan 2023
Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency
Wenlong Mou
Peng Ding
Martin J. Wainwright
Peter L. Bartlett
OffRL
40
10
0
16 Jan 2023
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
46
69
0
13 Dec 2022
Bayesian Counterfactual Mean Embeddings and Off-Policy Evaluation
Diego Martinez-Taboada
Dino Sejdinovic
CML
OffRL
27
0
0
02 Nov 2022
Deploying a Steered Query Optimizer in Production at Microsoft
Wangda Zhang
Matteo Interlandi
Paul Mineiro
S. Qiao
Nasim Ghazanfari
Marc T. Friedman
Rafah Hosn
Hiren Patel
Alekh Jindal
28
23
0
24 Oct 2022
Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency
Wenlong Mou
Martin J. Wainwright
Peter L. Bartlett
OffRL
43
11
0
26 Sep 2022
Fast Offline Policy Optimization for Large Scale Recommendation
Otmane Sakhi
D. Rohde
Alexandre Gilotte
OffRL
50
3
0
08 Aug 2022
Uncertainty Quantification for Fairness in Two-Stage Recommender Systems
Lequn Wang
Thorsten Joachims
30
22
0
30 May 2022
Safe Exploration for Efficient Policy Evaluation and Comparison
Runzhe Wan
Branislav Kveton
Rui Song
OffRL
38
10
0
26 Feb 2022
Off-Policy Evaluation Using Information Borrowing and Context-Based Switching
Sutanoy Dasgupta
Yabo Niu
Kishan Panaganti
D. Kalathil
D. Pati
Bani Mallick
OffRL
31
0
0
18 Dec 2021
Safe Data Collection for Offline and Online Policy Learning
Ruihao Zhu
Branislav Kveton
OffRL
21
5
0
08 Nov 2021
Off-Policy Evaluation in Partially Observed Markov Decision Processes under Sequential Ignorability
Yupeng Tang
Seung-seob Lee
OffRL
59
22
0
24 Oct 2021
Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
Noriyuki Kojima
Alane Suhr
Yoav Artzi
30
24
0
10 Aug 2021
Bandit Algorithms for Precision Medicine
Yangyi Lu
Ziping Xu
Ambuj Tewari
66
11
0
10 Aug 2021
Evaluating the progress of Deep Reinforcement Learning in the real world: aligning domain-agnostic and domain-specific research
J. Luis
E. Crawley
B. Cameron
OffRL
27
6
0
07 Jul 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu Wang
OffRL
36
19
0
13 May 2021
Policy Learning with Adaptively Collected Data
Ruohan Zhan
Zhimei Ren
Susan Athey
Zhengyuan Zhou
OffRL
45
27
0
05 May 2021
Off-Policy Risk Assessment in Contextual Bandits
Audrey Huang
Liu Leqi
Zachary Chase Lipton
Kamyar Azizzadenesheli
OffRL
32
36
0
18 Apr 2021
Benchmarks for Deep Off-Policy Evaluation
Justin Fu
Mohammad Norouzi
Ofir Nachum
George Tucker
Ziyun Wang
...
Yutian Chen
Aviral Kumar
Cosmin Paduraru
Sergey Levine
T. Paine
ELM
OffRL
35
100
0
30 Mar 2021
Instabilities of Offline RL with Pre-Trained Neural Representation
Ruosong Wang
Yifan Wu
Ruslan Salakhutdinov
Sham Kakade
OffRL
22
42
0
08 Mar 2021
Estimating Average Treatment Effects via Orthogonal Regularization
Tobias Hatt
Stefan Feuerriegel
CML
164
35
0
21 Jan 2021
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
24
73
0
17 Aug 2020
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning
Ming Yin
Yu Bai
Yu Wang
OffRL
44
31
0
07 Jul 2020
Off-policy Bandits with Deficient Support
Noveen Sachdeva
Yi-Hsun Su
Thorsten Joachims
OffRL
38
75
0
16 Jun 2020
Batch Stationary Distribution Estimation
Junfeng Wen
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
22
22
0
02 Mar 2020
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning
Ming Yin
Yu Wang
OffRL
29
80
0
29 Jan 2020
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Cameron Voloshin
Hoang Minh Le
Nan Jiang
Yisong Yue
OffRL
32
152
0
15 Nov 2019
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
30
67
0
16 Oct 2019
Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies
Xinyun Chen
Lu Wang
Yizhe Hang
Heng Ge
H. Zha
OffRL
14
5
0
10 Oct 2019
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Nathan Kallus
Masatoshi Uehara
OffRL
43
183
0
22 Aug 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
13
328
0
10 Jun 2019
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
24
54
0
09 Jun 2019
Balanced off-policy evaluation in general action spaces
A. Sondhi
David Arbour
Drew Dimmery
OffRL
29
17
0
09 Jun 2019
Empirical Likelihood for Contextual Bandits
Nikos Karampatziakis
John Langford
Paul Mineiro
OffRL
23
9
0
07 Jun 2019
Imitation-Regularized Offline Learning
Yifei Ma
Yu Wang
Balakrishnan
Balakrishnan Narayanaswamy
OffRL
16
22
0
15 Jan 2019
CAB: Continuous Adaptive Blending Estimator for Policy Evaluation and Learning
Yi-Hsun Su
Lequn Wang
Michele Santacatterina
Mohsen Guizani
CML
OffRL
15
6
0
06 Nov 2018
Confounding-Robust Policy Improvement
Nathan Kallus
Angela Zhou
CML
OffRL
40
152
0
22 May 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning
George Tucker
Surya Bhupatiraju
S. Gu
Richard Turner
Zoubin Ghahramani
Sergey Levine
OffRL
30
126
0
27 Feb 2018
Active Learning with Logged Data
Songbai Yan
Kamalika Chaudhuri
T. Javidi
34
27
0
25 Feb 2018
Policy Evaluation and Optimization with Continuous Treatments
Nathan Kallus
Angela Zhou
OffRL
13
132
0
16 Feb 2018
Estimation Considerations in Contextual Bandits
Maria Dimakopoulou
Zhengyuan Zhou
Susan Athey
Guido Imbens
34
69
0
19 Nov 2017
1
2
Next