arXiv:2007.11771
Batch Policy Learning in Average Reward Markov Decision Processes
23 July 2020
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
S. Murphy
OffRL
Papers citing "Batch Policy Learning in Average Reward Markov Decision Processes"
8 / 58 papers shown
Title
Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage
Jonathan D. Chang
Masatoshi Uehara
Dhruv Sreenivas
Rahul Kidambi
Wen Sun
OffRL
24
32
0
06 Jun 2021
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
12
35
0
10 May 2021
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
Paria Rashidinejad
Banghua Zhu
Cong Ma
Jiantao Jiao
Stuart J. Russell
OffRL
28
273
0
22 Mar 2021
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
27
346
0
30 Dec 2020
Robust Batch Policy Learning in Markov Decision Processes
Zhengling Qi
Peng Liao
OffRL
6
4
0
09 Nov 2020
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework
C. Shi
Xiaoyu Wang
Shuang Luo
Hongtu Zhu
Jieping Ye
R. Song
CML
OffRL
30
33
0
05 Feb 2020
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Nathan Kallus
Masatoshi Uehara
OffRL
38
181
0
22 Aug 2019
A Batch, Off-Policy, Actor-Critic Algorithm for Optimizing the Average Reward
S. Murphy
Yanzhen Deng
Eric B. Laber
H. Maei
R. Sutton
K. Witkiewitz
OffRL
33
22
0
18 Jul 2016