Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2202.00076
Cited By
v1
v2
v3 (latest)
Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration
31 January 2022
Chengzhuo Ni
Ruiqi Zhang
Xiang Ji
Xuezhou Zhang
Mengdi Wang
OffRL
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration"
14 / 14 papers shown
Title
Optimal policy evaluation using kernel-based temporal difference methods
Yaqi Duan
Mengdi Wang
Martin J. Wainwright
OffRL
59
27
0
24 Sep 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
95
119
0
19 Aug 2021
Bellman-consistent Pessimism for Offline Reinforcement Learning
Tengyang Xie
Ching-An Cheng
Nan Jiang
Paul Mineiro
Alekh Agarwal
OffRL
LRM
154
278
0
13 Jun 2021
Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage
Jonathan D. Chang
Masatoshi Uehara
Dhruv Sreenivas
Rahul Kidambi
Wen Sun
OffRL
97
32
0
06 Jun 2021
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
Paria Rashidinejad
Banghua Zhu
Cong Ma
Jiantao Jiao
Stuart J. Russell
OffRL
225
289
0
22 Mar 2021
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
Junyu Zhang
Chengzhuo Ni
Zheng Yu
Csaba Szepesvári
Mengdi Wang
97
69
0
17 Feb 2021
Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
Botao Hao
X. Ji
Yaqi Duan
Hao Lu
Csaba Szepesvári
Mengdi Wang
OffRL
51
40
0
06 Feb 2021
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
176
359
0
30 Dec 2020
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
64
140
0
04 Jul 2020
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
Tengyang Xie
Yifei Ma
Yu Wang
OffRL
97
181
0
08 Jun 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
82
194
0
05 Jun 2019
An Improved Convergence Analysis of Stochastic Variance-Reduced Policy Gradient
Pan Xu
F. Gao
Quanquan Gu
67
97
0
29 May 2019
Off-Policy Policy Gradient with State Distribution Correction
Yao Liu
Adith Swaminathan
Alekh Agarwal
Emma Brunskill
OffRL
157
67
0
17 Apr 2019
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
432
576
0
04 Apr 2016
1