2301.13734
Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design
31 January 2023
Shuze Liu
Shangtong Zhang
OffRL
Papers citing
"Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design"
19 / 19 papers shown
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
Claire Chen
Shuze Liu
Shangtong Zhang
OffRL
362
1
0
08 Oct 2024
Doubly Optimal Policy Evaluation for Reinforcement Learning
Shuze Liu
Claire Chen
Shangtong Zhang
OffRL
177
3
0
03 Oct 2024
ReVar: Strengthening Policy Evaluation via Reduced Variance Sampling
Subhojyoti Mukherjee
Josiah P. Hanna
Robert D. Nowak
OffRL
63
15
0
09 Mar 2022
Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning
Siyuan Zhang
Nan Jiang
OffRL
66
39
0
26 Oct 2021
Batch Value-function Approximation with Only Realizability
Tengyang Xie
Nan Jiang
OffRL
382
121
0
11 Aug 2020
Off-Policy Evaluation via the Regularized Lagrangian
Mengjiao Yang
Ofir Nachum
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
41
118
0
07 Jul 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GP
OffRL
229
1,381
0
15 Apr 2020
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values
Shangtong Zhang
Bo Liu
Shimon Whiteson
OffRL
59
103
0
29 Jan 2020
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
151
338
0
10 Jun 2019
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
Tengyang Xie
Yifei Ma
Yu Wang
OffRL
97
181
0
08 Jun 2019
Information-Theoretic Considerations in Batch Reinforcement Learning
Jinglin Chen
Nan Jiang
OOD
OffRL
161
378
0
01 May 2019
Planning with Expectation Models
Yi Wan
M. Zaheer
Adam White
Martha White
R. Sutton
OffRL
61
24
0
02 Apr 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
236
1,624
0
07 Dec 2018
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
OffRL
158
356
0
29 Oct 2018
Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods
Craig Sherstan
Brendan Bennett
K. Young
Dylan R. Ashley
Adam White
Martha White
R. Sutton
47
15
0
25 Jan 2018
The Uncertainty Bellman Equation and Exploration
Brendan O'Donoghue
Ian Osband
Rémi Munos
Volodymyr Mnih
70
192
0
15 Sep 2017
Safe and Efficient Off-Policy Reinforcement Learning
Rémi Munos
T. Stepleton
Anna Harutyunyan
Marc G. Bellemare
OffRL
138
617
0
08 Jun 2016
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
91
272
0
14 Mar 2015
Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
R. Sutton
Csaba Szepesvári
A. Geramifard
Michael Bowling
OffRL
86
203
0
13 Jun 2012