Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning

4 April 2016
Philip S. Thomas
Emma Brunskill
    OffRL

Papers citing "Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning"

50 / 342 papers shown
Towards Safe Policy Improvement for Non-Stationary MDPs
Yash Chandak
Scott M. Jordan
Georgios Theocharous
Martha White
Philip S. Thomas
OffRL
71
33
0
23 Oct 2020
What are the Statistical Limits of Offline RL with Linear Function Approximation?
Ruosong Wang
Dean Phillips Foster
Sham Kakade
OffRL
30
158
0
22 Oct 2020
CoinDICE: Off-Policy Confidence Interval Estimation
Bo Dai
Ofir Nachum
Yinlam Chow
Lihong Li
Csaba Szepesvári
Dale Schuurmans
OffRL
29
84
0
22 Oct 2020
Optimal Off-Policy Evaluation from Multiple Logging Policies
Nathan Kallus
Yuta Saito
Masatoshi Uehara
OffRL
19
40
0
21 Oct 2020
Human-centric Dialog Training via Offline Reinforcement Learning
Natasha Jaques
J. Shen
Asma Ghandeharioun
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
40
93
0
12 Oct 2020
Online Safety Assurance for Deep Reinforcement Learning
Noga H. Rotman
Michael Schapira
Aviv Tamar
OffRL
41
5
0
07 Oct 2020
Information Theoretic Counterfactual Learning from Missing-Not-At-Random Feedback
Zifeng Wang
Xi Chen
Rui Wen
Shao-Lun Huang
E. Kuruoglu
Yefeng Zheng
BDL
CML
OffRL
14
81
0
06 Sep 2020
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
24
73
0
17 Aug 2020
Accountable Off-Policy Evaluation With Kernel Bellman Statistics
Yihao Feng
Tongzheng Ren
Ziyang Tang
Qiang Liu
OffRL
18
42
0
15 Aug 2020
Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders
Andrew Bennett
Nathan Kallus
Lihong Li
Ali Mousavi
OffRL
35
43
0
27 Jul 2020
Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation
Ilya Kostrikov
Ofir Nachum
OffRL
9
29
0
27 Jul 2020
Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions
James McInerney
B. Brost
Praveen Chandar
Rishabh Mehrotra
Ben Carterette
BDL
CML
OffRL
121
55
0
25 Jul 2020
Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies
Shengpu Tang
Aditya Modi
Michael Sjoding
Jenna Wiens
OffRL
22
25
0
24 Jul 2020
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
Susan Murphy
OffRL
36
81
0
23 Jul 2020
Hyperparameter Selection for Offline Reinforcement Learning
T. Paine
Cosmin Paduraru
Andrea Michi
Çağlar Gülçehre
Konrad Zolna
Alexander Novikov
Ziyun Wang
Nando de Freitas
GP
OffRL
49
146
0
17 Jul 2020
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning
Ming Yin
Yu Bai
Yu Wang
OffRL
44
31
0
07 Jul 2020
Counterfactual Data Augmentation using Locally Factored Dynamics
Silviu Pitis
Elliot Creager
Animesh Garg
BDL
OffRL
28
87
0
06 Jul 2020
Off-Policy Exploitability-Evaluation in Two-Player Zero-Sum Markov Games
Kenshi Abe
Yusuke Kaneko
OffRL
14
2
0
04 Jul 2020
Learning to search efficiently for causally near-optimal treatments
Samuel Håkansson
Viktor Lindblom
Omer Gottesman
Fredrik D. Johansson
CML
6
6
0
02 Jul 2020
Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation
Aaron Sonabend-W
Junwei Lu
Leo Anthony Celi
Tianxi Cai
Peter Szolovits
OffRL
16
24
0
23 Jun 2020
Off-policy Bandits with Deficient Support
Noveen Sachdeva
Yi-Hsun Su
Thorsten Joachims
OffRL
38
75
0
16 Jun 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
46
592
0
16 Jun 2020
Self-Imitation Learning via Generalized Lower Bound Q-learning
Yunhao Tang
SSL
33
24
0
12 Jun 2020
Bandits with Partially Observable Confounded Data
Guy Tennenholtz
Uri Shalit
Shie Mannor
Yonathan Efroni
OffRL
9
23
0
11 Jun 2020
ColdGANs: Taming Language GANs with Cautious Sampling Strategies
Thomas Scialom
Paul-Alexis Dray
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
GAN
SyDa
25
18
0
08 Jun 2020
Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies
Nathan Kallus
Masatoshi Uehara
OffRL
16
15
0
06 Jun 2020
Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
6
8
0
06 Jun 2020
Causality and Batch Reinforcement Learning: Complementary Approaches To Planning In Unknown Domains
James Bannon
Bradford T. Windsor
Wenbo Song
Tao Li
CML
OOD
OffRL
26
20
0
03 Jun 2020
Optimizing for the Future in Non-Stationary MDPs
Yash Chandak
Georgios Theocharous
Shiv Shankar
Martha White
Sridhar Mahadevan
Philip S. Thomas
OffRL
20
65
0
17 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
358
1,974
0
04 May 2020
Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning
Archit Sharma
Michael Ahn
Sergey Levine
Vikash Kumar
Karol Hausman
S. Gu
SSL
OffRL
26
46
0
27 Apr 2020
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang
Bo Liu
Shimon Whiteson
29
38
0
22 Apr 2020
A Game Theoretic Framework for Model Based Reinforcement Learning
Aravind Rajeswaran
Igor Mordatch
Vikash Kumar
OffRL
25
127
0
16 Apr 2020
Power Constrained Bandits
Jiayu Yao
Emma Brunskill
Weiwei Pan
Susan Murphy
Finale Doshi-Velez
21
36
0
13 Apr 2020
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Ali Mousavi
Lihong Li
Qiang Liu
Denny Zhou
OffRL
29
32
0
24 Mar 2020
Optimizing Medical Treatment for Sepsis in Intensive Care: from Reinforcement Learning to Pre-Trial Evaluation
Luchen Li
I. Albert-Smet
Aldo A. Faisal
OffRL
20
10
0
13 Mar 2020
Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding
Hongseok Namkoong
Ramtin Keramati
Steve Yadlowsky
Emma Brunskill
OffRL
24
63
0
12 Mar 2020
Batch Stationary Distribution Estimation
Junfeng Wen
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
24
22
0
02 Mar 2020
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Yaqi Duan
Mengdi Wang
OffRL
32
149
0
21 Feb 2020
GenDICE: Generalized Offline Estimation of Stationary Values
Ruiyi Zhang
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
14
172
0
21 Feb 2020
Debiased Off-Policy Evaluation for Recommendation Systems
Yusuke Narita
Shota Yasui
Kohei Yata
OffRL
16
11
0
20 Feb 2020
Adaptive Estimator Selection for Off-Policy Evaluation
Yi-Hsun Su
Pavithra Srinath
A. Krishnamurthy
OffRL
8
45
0
18 Feb 2020
Double/Debiased Machine Learning for Dynamic Treatment Effects via g-Estimation
Greg Lewis
Vasilis Syrgkanis
CML
6
35
0
17 Feb 2020
Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning
Nathan Kallus
Angela Zhou
OffRL
38
58
0
11 Feb 2020
Statistically Efficient Off-Policy Policy Gradients
Nathan Kallus
Masatoshi Uehara
OffRL
24
37
0
10 Feb 2020
Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions
Omer Gottesman
Joseph D. Futoma
Yao Liu
Sonali Parbhoo
Leo Anthony Celi
Emma Brunskill
Finale Doshi-Velez
OffRL
147
56
0
10 Feb 2020
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework
C. Shi
Xiaoyu Wang
Shuang Luo
Hongtu Zhu
Jieping Ye
R. Song
CML
OffRL
35
33
0
05 Feb 2020
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning
Ming Yin
Yu Wang
OffRL
29
80
0
29 Jan 2020
On the Fairness of Randomized Trials for Recommendation with Heterogeneous Demographics and Beyond
Zifeng Wang
Xi Chen
Rui Wen
Shao-Lun Huang
22
1
0
25 Jan 2020
Off-Policy Estimation of Long-Term Average Outcomes with Applications to Mobile Health
Peng Liao
P. Klasnja
Susan Murphy
OffRL
27
66
0
30 Dec 2019