1511.03722
Cited By
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
11 November 2015
Nan Jiang
Lihong Li
OffRL
Papers citing
"Doubly Robust Off-policy Value Evaluation for Reinforcement Learning"
50 / 163 papers shown
Title
Human-centric Dialog Training via Offline Reinforcement Learning
Natasha Jaques
J. Shen
Asma Ghandeharioun
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
40
93
0
12 Oct 2020
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
24
73
0
17 Aug 2020
Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies
Shengpu Tang
Aditya Modi
Michael Sjoding
Jenna Wiens
OffRL
22
25
0
24 Jul 2020
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
Susan Murphy
OffRL
36
81
0
23 Jul 2020
Hyperparameter Selection for Offline Reinforcement Learning
T. Paine
Cosmin Paduraru
Andrea Michi
Çağlar Gülçehre
Konrad Zolna
Alexander Novikov
Ziyun Wang
Nando de Freitas
GP
OffRL
49
146
0
17 Jul 2020
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning
Ming Yin
Yu Bai
Yu Wang
OffRL
44
31
0
07 Jul 2020
Off-policy Bandits with Deficient Support
Noveen Sachdeva
Yi-Hsun Su
Thorsten Joachims
OffRL
38
75
0
16 Jun 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
46
592
0
16 Jun 2020
Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies
Nathan Kallus
Masatoshi Uehara
OffRL
16
15
0
06 Jun 2020
Optimizing for the Future in Non-Stationary MDPs
Yash Chandak
Georgios Theocharous
Shiv Shankar
Martha White
Sridhar Mahadevan
Philip S. Thomas
OffRL
18
65
0
17 May 2020
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang
Bo Liu
Shimon Whiteson
29
38
0
22 Apr 2020
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Ali Mousavi
Lihong Li
Qiang Liu
Denny Zhou
OffRL
27
32
0
24 Mar 2020
Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding
Hongseok Namkoong
Ramtin Keramati
Steve Yadlowsky
Emma Brunskill
OffRL
24
63
0
12 Mar 2020
Batch Stationary Distribution Estimation
Junfeng Wen
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
22
22
0
02 Mar 2020
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Yaqi Duan
Mengdi Wang
OffRL
32
149
0
21 Feb 2020
Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning
Nathan Kallus
Angela Zhou
OffRL
38
58
0
11 Feb 2020
Estimating Counterfactual Treatment Outcomes over Time Through Adversarially Balanced Representations
Ioana Bica
Ahmed Alaa
James Jordon
M. Schaar
BDL
CML
16
180
0
10 Feb 2020
Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions
Omer Gottesman
Joseph D. Futoma
Yao Liu
Sonali Parbhoo
Leo Anthony Celi
Emma Brunskill
Finale Doshi-Velez
OffRL
147
56
0
10 Feb 2020
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
Nan Jiang
Jiawei Huang
OffRL
41
17
0
06 Feb 2020
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework
C. Shi
Xiaoyu Wang
Shuang Luo
Hongtu Zhu
Jieping Ye
R. Song
CML
OffRL
32
33
0
05 Feb 2020
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning
Ming Yin
Yu Wang
OffRL
29
80
0
29 Jan 2020
Reinforcement Learning via Fenchel-Rockafellar Duality
Ofir Nachum
Bo Dai
OffRL
16
118
0
07 Jan 2020
Off-Policy Estimation of Long-Term Average Outcomes with Applications to Mobile Health
Peng Liao
P. Klasnja
Susan Murphy
OffRL
27
66
0
30 Dec 2019
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Cameron Voloshin
Hoang Minh Le
Nan Jiang
Yisong Yue
OffRL
32
152
0
15 Nov 2019
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
31
184
0
28 Oct 2019
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
Xinyue Chen
Zijian Zhou
Junyao Xing
Che Wang
Yanqiu Wu
Keith Ross
OffRL
35
121
0
27 Oct 2019
From Importance Sampling to Doubly Robust Policy Gradient
Jiawei Huang
Nan Jiang
OffRL
35
24
0
20 Oct 2019
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
30
67
0
16 Oct 2019
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
Yao Liu
Pierre-Luc Bacon
Emma Brunskill
OffRL
22
45
0
15 Oct 2019
Meta-Q-Learning
Rasool Fakoor
Pratik Chaudhari
Stefano Soatto
Alex Smola
OffRL
33
145
0
30 Sep 2019
Causal Modeling for Fairness in Dynamical Systems
Elliot Creager
David Madras
T. Pitassi
R. Zemel
29
67
0
18 Sep 2019
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
26
88
0
12 Sep 2019
Personalized HeartSteps: A Reinforcement Learning Algorithm for Optimizing Physical Activity
Peng Liao
Kristjan Greenewald
P. Klasnja
Susan Murphy
25
83
0
08 Sep 2019
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Nathan Kallus
Masatoshi Uehara
OffRL
43
183
0
22 Aug 2019
Doubly-Robust Lasso Bandit
Gi-Soo Kim
M. Paik
24
61
0
26 Jul 2019
Task Selection Policies for Multitask Learning
John Glover
Chris Hokamp
OffRL
29
7
0
14 Jul 2019
A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms
Oliver Kroemer
S. Niekum
George Konidaris
41
356
0
06 Jul 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
13
328
0
10 Jun 2019
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
24
54
0
09 Jun 2019
Learning When-to-Treat Policies
Xinkun Nie
Emma Brunskill
Stefan Wager
CML
OffRL
26
89
0
23 May 2019
Experimental Evaluation of Individualized Treatment Rules
Kosuke Imai
Michael Lingzhi Li
9
38
0
14 May 2019
Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs
Marek Petrik
R. Russel
29
61
0
20 Feb 2019
Imitation-Regularized Offline Learning
Yifei Ma
Yu Wang
Balakrishnan Narayanaswamy
OffRL
16
22
0
15 Jan 2019
Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search
Lars Buesing
T. Weber
Yori Zwols
S. Racanière
A. Guez
Jean-Baptiste Lespiau
N. Heess
CML
37
135
0
15 Nov 2018
Policy Certificates: Towards Accountable Reinforcement Learning
Christoph Dann
Lihong Li
Wei Wei
Emma Brunskill
OffRL
25
141
0
07 Nov 2018
Neural Approaches to Conversational AI
Jianfeng Gao
Michel Galley
Lihong Li
49
670
0
21 Sep 2018
Per-decision Multi-step Temporal Difference Learning with Control Variates
Kristopher De Asis
R. Sutton
24
7
0
05 Jul 2018
Behaviour Policy Estimation in Off-Policy Policy Evaluation: Calibration Matters
Aniruddh Raghu
Omer Gottesman
Yao Liu
Matthieu Komorowski
A. Faisal
Finale Doshi-Velez
Emma Brunskill
OffRL
33
33
0
03 Jul 2018
Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning
Julia Kreutzer
Joshua Uyheng
Stefan Riezler
33
85
0
27 May 2018
Representation Balancing MDPs for Off-Policy Policy Evaluation
Yao Liu
Omer Gottesman
Aniruddh Raghu
Matthieu Komorowski
A. Faisal
Finale Doshi-Velez
Emma Brunskill
OffRL
27
75
0
23 May 2018