2101.02808
Average-Reward Off-Policy Policy Evaluation with Function Approximation
8 January 2021
Shangtong Zhang
Yi Wan
R. Sutton
Shimon Whiteson
OffRL
Papers citing "Average-Reward Off-Policy Policy Evaluation with Function Approximation"
21 / 21 papers shown
Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features
Zixuan Xie
Xinyu Liu
Rohan Chandra
Shangtong Zhang
45
0
0
27 May 2025
Learning and Planning in Average-Reward Markov Decision Processes
Yi Wan
A. Naik
R. Sutton
OffRL
69
61
0
29 Jun 2020
MOReL: Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
111
677
0
12 May 2020
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang
Bo Liu
Shimon Whiteson
92
38
0
22 Apr 2020
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values
Shangtong Zhang
Bo Liu
Shimon Whiteson
OffRL
97
103
0
29 Jan 2020
AlgaeDICE: Policy Gradient from Arbitrary Experience
Ofir Nachum
Bo Dai
Ilya Kostrikov
Yinlam Chow
Lihong Li
Dale Schuurmans
OffRL
166
244
0
04 Dec 2019
A Convergent Off-Policy Temporal Difference Algorithm
Raghuram Bharadwaj Diddigi
Chandramouli Kamanchi
S. Bhatnagar
OffRL
37
8
0
13 Nov 2019
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
164
69
0
16 Oct 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
155
338
0
10 Jun 2019
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
Tengyang Xie
Yifei Ma
Yu Wang
OffRL
115
181
0
08 Jun 2019
Combining Parametric and Nonparametric Models for Off-Policy Evaluation
Omer Gottesman
Yao Liu
Scott Sussex
Emma Brunskill
Finale Doshi-Velez
OffRL
100
36
0
14 May 2019
Challenges of Real-World Reinforcement Learning
Gabriel Dulac-Arnold
D. Mankowitz
Todd Hester
OffRL
99
551
0
29 Apr 2019
Planning with Expectation Models
Yi Wan
M. Zaheer
Adam White
Martha White
R. Sutton
OffRL
76
24
0
02 Apr 2019
Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift
Carles Gelada
Marc G. Bellemare
OffRL
73
99
0
27 Jan 2019
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
OffRL
177
356
0
29 Oct 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
200
5,226
0
26 Feb 2018
On Convergence of some Gradient-based Temporal-Differences Algorithms for Off-Policy Learning
Huizhen Yu
OffRL
99
32
0
27 Dec 2017
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
147
1,963
0
19 Sep 2017
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
98
272
0
14 Mar 2015
Distributed Policy Evaluation Under Multiple Behavior Strategies
Sergio Valcarcel Macua
Jianshu Chen
S. Zazo
Ali H. Sayed
130
104
0
30 Dec 2013
Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
R. Sutton
Csaba Szepesvári
A. Geramifard
Michael Bowling
OffRL
98
204
0
13 Jun 2012