Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning

4 April 2016
Philip S. Thomas
Emma Brunskill
OffRL
arXiv: 1604.00923

Papers citing "Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning"

50 / 342 papers shown
More Efficient Off-Policy Evaluation through Regularized Targeted Learning
Aurélien F. Bibaut
Ivana Malenica
N. Vlassis
Mark van der Laan
OOD
OffRL
6
38
0
13 Dec 2019
Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning
Riashat Islam
Raihan Seraj
Samin Yeasar Arnob
Doina Precup
OffRL
17
3
0
11 Dec 2019
Merging Deterministic Policy Gradient Estimations with Varied Bias-Variance Tradeoff for Effective Deep Reinforcement Learning
Gang Chen
25
4
0
24 Nov 2019
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Cameron Voloshin
Hoang Minh Le
Nan Jiang
Yisong Yue
OffRL
35
152
0
15 Nov 2019
Triply Robust Off-Policy Evaluation
Anqi Liu
Hao Liu
Anima Anandkumar
Yisong Yue
OffRL
35
10
0
13 Nov 2019
Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation
Xueying Bai
Jian Guan
Hongning Wang
OffRL
14
75
0
10 Nov 2019
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
Xinyue Chen
Zijian Zhou
Junyao Xing
Che Wang
Yanqiu Wu
Keith Ross
OffRL
35
121
0
27 Oct 2019
From Importance Sampling to Doubly Robust Policy Gradient
Jiawei Huang
Nan Jiang
OffRL
35
24
0
20 Oct 2019
Adaptive Trade-Offs in Off-Policy Learning
Mark Rowland
Will Dabney
Rémi Munos
OffRL
25
22
0
16 Oct 2019
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
30
67
0
16 Oct 2019
A unified view of likelihood ratio and reparameterization gradients and an optimal importance sampling scheme
Paavo Parmas
Masashi Sugiyama
24
3
0
14 Oct 2019
Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies
Xinyun Chen
Lu Wang
Yizhe Hang
Heng Ge
H. Zha
OffRL
14
5
0
10 Oct 2019
Benchmarking Batch Deep Reinforcement Learning Algorithms
Shih-Han Chou
Wen-Yen Chang
W. Hsu
Jianlong Fu
OffRL
27
182
0
03 Oct 2019
Nearly Consistent Finite Particle Estimates in Streaming Importance Sampling
Alec Koppel
Amrit Singh Bedi
Brian M. Sadler
Victor Elvira
37
2
0
23 Sep 2019
Causal Modeling for Fairness in Dynamical Systems
Elliot Creager
David Madras
T. Pitassi
R. Zemel
29
67
0
18 Sep 2019
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
26
88
0
12 Sep 2019
Off-Policy Evaluation in Partially Observable Environments
Guy Tennenholtz
Shie Mannor
Uri Shalit
OffRL
22
85
0
09 Sep 2019
Personalized HeartSteps: A Reinforcement Learning Algorithm for Optimizing Physical Activity
Peng Liao
Kristjan Greenewald
P. Klasnja
Susan Murphy
25
83
0
08 Sep 2019
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Nathan Kallus
Masatoshi Uehara
OffRL
46
183
0
22 Aug 2019
Reinforcement Learning in Healthcare: A Survey
Chao Yu
Jiming Liu
S. Nemati
LM&MA
OffRL
33
551
0
22 Aug 2019
Batch Recurrent Q-Learning for Backchannel Generation Towards Engaging Agents
Nusrah Hussain
E. Erzin
T. Metin Sezgin
Y. Yemez
OffRL
19
7
0
06 Aug 2019
Speech Driven Backchannel Generation using Deep Q-Network for Enhancing Engagement in Human-Robot Interaction
Nusrah Hussain
E. Erzin
T. Metin Sezgin
Y. Yemez
11
0
0
05 Aug 2019
Off-policy Learning for Multiple Loggers
Li He
Long Xia
Wei Zeng
Zhi-Ming Ma
Yue Zhao
Dawei Yin
OffRL
20
10
0
23 Jul 2019
Low-Variance and Zero-Variance Baselines for Extensive-Form Games
Trevor Davis
Martin Schmid
Michael Bowling
OffRL
19
19
0
22 Jul 2019
Doubly robust off-policy evaluation with shrinkage
Yi-Hsun Su
Maria Dimakopoulou
A. Krishnamurthy
Miroslav Dudík
OffRL
27
104
0
22 Jul 2019
An Optimistic Perspective on Offline Reinforcement Learning
Rishabh Agarwal
Dale Schuurmans
Mohammad Norouzi
OffRL
OnRL
38
69
0
10 Jul 2019
A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms
Oliver Kroemer
S. Niekum
George Konidaris
41
356
0
06 Jul 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
45
337
0
30 Jun 2019
Expected Sarsa(λ) with Control Variate for Variance Reduction
Long Yang
Yu Zhang
Jun Wen
Qian Zheng
Pengfei Li
Gang Pan
27
0
0
25 Jun 2019
Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting
Aditya Grover
Jiaming Song
Alekh Agarwal
Kenneth Tran
Ashish Kapoor
Eric Horvitz
Stefano Ermon
26
123
0
23 Jun 2019
Importance Resampling for Off-policy Prediction
M. Schlegel
Wesley Chung
Daniel Graves
Jian Qian
Martha White
OffRL
14
41
0
11 Jun 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
13
328
0
10 Jun 2019
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
30
54
0
09 Jun 2019
Balanced off-policy evaluation in general action spaces
A. Sondhi
David Arbour
Drew Dimmery
OffRL
29
17
0
09 Jun 2019
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
Tengyang Xie
Yifei Ma
Yu Wang
OffRL
57
178
0
08 Jun 2019
Off-Policy Evaluation via Off-Policy Classification
A. Irpan
Kanishka Rao
Konstantinos Bousmalis
Chris Harris
Julian Ibarz
Sergey Levine
OffRL
19
50
0
04 Jun 2019
Learning When-to-Treat Policies
Xinkun Nie
Emma Brunskill
Stefan Wager
CML
OffRL
29
89
0
23 May 2019
Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
Michael Oberst
David Sontag
CML
OffRL
26
169
0
14 May 2019
Combining Parametric and Nonparametric Models for Off-Policy Evaluation
Omer Gottesman
Yao Liu
Scott Sussex
Emma Brunskill
Finale Doshi-Velez
OffRL
33
33
0
14 May 2019
Challenges of Real-World Reinforcement Learning
Gabriel Dulac-Arnold
D. Mankowitz
Todd Hester
OffRL
37
543
0
29 Apr 2019
Off-Policy Policy Gradient with State Distribution Correction
Yao Liu
Adith Swaminathan
Alekh Agarwal
Emma Brunskill
OffRL
21
67
0
17 Apr 2019
Truly Batch Apprenticeship Learning with Deep Successor Features
Donghun Lee
Srivatsan Srinivasan
Finale Doshi-Velez
OffRL
OOD
25
35
0
24 Mar 2019
Machine Learning Methods Economists Should Know About
Susan Athey
Guido Imbens
37
668
0
24 Mar 2019
Batch Policy Learning under Constraints
Hoang Minh Le
Cameron Voloshin
Yisong Yue
OffRL
16
320
0
20 Mar 2019
Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs
Marek Petrik
R. Russel
35
61
0
20 Feb 2019
Imitation-Regularized Offline Learning
Yifei Ma
Yu Wang
Balakrishnan Narayanaswamy
OffRL
27
22
0
15 Jan 2019
Improving Sepsis Treatment Strategies by Combining Deep and Kernel-Based Reinforcement Learning
Xuefeng Peng
Yi Ding
David Wihl
Omer Gottesman
Matthieu Komorowski
Li-wei H. Lehman
A. Ross
A. Faisal
Finale Doshi-Velez
OffRL
27
86
0
15 Jan 2019
Stochastic Doubly Robust Gradient
Kanghoon Lee
Jihye Choi
Moonsu Cha
Jung Kwon Lee
Tae-Yoon Kim
21
0
0
21 Dec 2018
Balanced Linear Contextual Bandits
Maria Dimakopoulou
Zhengyuan Zhou
Susan Athey
Guido Imbens
18
64
0
15 Dec 2018
Top-K Off-Policy Correction for a REINFORCE Recommender System
Minmin Chen
Alex Beutel
Paul Covington
Sagar Jain
Francois Belletti
Ed H. Chi
CML
OffRL
33
474
0
06 Dec 2018