1810.12429
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
29 October 2018
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
OffRL
Papers citing
"Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation"
50 / 252 papers shown
Title
Truly Deterministic Policy Optimization
Ehsan Saleh
Saba Ghaffari
Timothy Bretl
Matthew West
OffRL
57
3
0
30 May 2022
Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret
Jiawei Huang
Li Zhao
Tao Qin
Wei Chen
Nan Jiang
Tie-Yan Liu
OffRL
56
3
0
25 May 2022
Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
David Bruns-Smith
CML
ELM
OffRL
63
13
0
02 Apr 2022
Marginalized Operators for Off-policy Reinforcement Learning
Yunhao Tang
Mark Rowland
Rémi Munos
Michal Valko
OffRL
61
0
0
30 Mar 2022
Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps
Jinglin Chen
Nan Jiang
OffRL
120
35
0
25 Mar 2022
Bellman Residual Orthogonalization for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
OffRL
101
8
0
24 Mar 2022
Importance Sampling Placement in Off-Policy Temporal-Difference Methods
Eric Graves
Sina Ghiassian
OffRL
77
2
0
18 Mar 2022
DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning
Jinxin Liu
Hongyin Zhang
Donglin Wang
OffRL
85
37
0
13 Mar 2022
A Complete Characterization of Linear Estimators for Offline Policy Evaluation
Juan C. Perdomo
A. Krishnamurthy
Peter L. Bartlett
Sham Kakade
OffRL
69
3
0
08 Mar 2022
Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons
C. Shi
Shuang Luo
Yuan Le
Hongtu Zhu
R. Song
OffRL
OnRL
74
12
0
26 Feb 2022
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
69
9
0
23 Feb 2022
Continual Auxiliary Task Learning
Matt McLeod
Chun-Ping Lo
M. Schlegel
Andrew Jacobsen
Raksha Kumaraswamy
Martha White
Adam White
CLL
60
9
0
22 Feb 2022
Policy Evaluation for Temporal and/or Spatial Dependent Experiments
Shuang Luo
Ying Yang
Chengchun Shi
Fang Yao
Jieping Ye
Hongtu Zhu
110
8
0
22 Feb 2022
Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process
C. Shi
Jin Zhu
Ye Shen
Shuang Luo
Hong Zhu
R. Song
OffRL
115
34
0
22 Feb 2022
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
C. Shi
Runzhe Wan
Ge Song
Shuang Luo
R. Song
Hongtu Zhu
OffRL
73
6
0
21 Feb 2022
A Behavior Regularized Implicit Policy for Offline Reinforcement Learning
Shentao Yang
Zhendong Wang
Huangjie Zheng
Yihao Feng
Mingyuan Zhou
OffRL
57
9
0
19 Feb 2022
Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality
Jiawei Huang
Jinglin Chen
Li Zhao
Tao Qin
Nan Jiang
Tie-Yan Liu
OffRL
91
24
0
14 Feb 2022
Off-Policy Evaluation for Large Action Spaces via Embeddings
Yuta Saito
Thorsten Joachims
OffRL
82
46
0
13 Feb 2022
Offline Reinforcement Learning with Realizability and Single-policy Concentrability
Wenhao Zhan
Baihe Huang
Audrey Huang
Nan Jiang
Jason D. Lee
OffRL
394
112
0
09 Feb 2022
Stochastic Gradient Descent with Dependent Data for Offline Reinforcement Learning
Jing-rong Dong
Xin T. Tong
OffRL
89
2
0
06 Feb 2022
A Temporal-Difference Approach to Policy Gradient Estimation
Samuele Tosatto
Andrew Patterson
Martha White
A. R. Mahmood
OffRL
104
2
0
04 Feb 2022
Robust Imitation Learning from Corrupted Demonstrations
Liu Liu
Ziyang Tang
Lanqing Li
Dijun Luo
71
13
0
29 Jan 2022
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
Xiaohong Chen
Zhengling Qi
OffRL
91
35
0
17 Jan 2022
Operator Deep Q-Learning: Zero-Shot Reward Transferring in Reinforcement Learning
Ziyang Tang
Yihao Feng
Qiang Liu
OffRL
43
1
0
01 Jan 2022
Off Environment Evaluation Using Convex Risk Minimization
Pulkit Katdare
Shuijing Liu
Katherine Driggs-Campbell
47
2
0
21 Dec 2021
Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency
Mingfei Sun
Sam Devlin
Katja Hofmann
Shimon Whiteson
30
4
0
11 Dec 2021
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
93
4
0
29 Nov 2021
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes
C. Shi
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
90
26
0
12 Nov 2021
SOPE: Spectrum of Off-Policy Estimators
C. J. Yuan
Yash Chandak
S. Giguere
Philip S. Thomas
S. Niekum
OffRL
93
5
0
06 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
107
12
0
04 Nov 2021
Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning
Siyuan Zhang
Nan Jiang
OffRL
93
39
0
26 Oct 2021
False Correlation Reduction for Offline Reinforcement Learning
Arvindkumar Krishnakumar
Zuyue Fu
Lingxiao Wang
Zhuoran Yang
Chenjia Bai
Tianyi Zhou
Judy Hoffman
Jing Jiang
OffRL
76
9
0
24 Oct 2021
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm
Raghuram Bharadwaj Diddigi
Prateek Jain
P. J
S. Bhatnagar
CML
OffRL
81
3
0
19 Oct 2021
Offline Reinforcement Learning with Soft Behavior Regularization
Haoran Xu
Xianyuan Zhan
Jianxiong Li
Honglei Yin
OffRL
79
31
0
14 Oct 2021
Explaining Off-Policy Actor-Critic From A Bias-Variance Perspective
Ting-Han Fan
Peter J. Ramadge
CML
FAtt
OffRL
63
2
0
06 Oct 2021
The f-Divergence Reinforcement Learning Framework
Chen Gong
Qiang He
Yunpeng Bai
Zhouyi Yang
Xiaoyu Chen
Xinwen Hou
Xianjie Zhang
Yu Liu
Guoliang Fan
68
3
0
24 Sep 2021
DROMO: Distributionally Robust Offline Model-based Policy Optimization
Ruizhen Liu
Dazhi Zhong
Zhi-Cong Chen
OffRL
53
3
0
15 Sep 2021
State Relevance for Off-Policy Evaluation
S. Shen
Yecheng Ma
Omer Gottesman
Finale Doshi-Velez
OffRL
59
4
0
13 Sep 2021
Projected State-action Balancing Weights for Offline Reinforcement Learning
Jiayi Wang
Zhengling Qi
Raymond K. W. Wong
OffRL
73
19
0
10 Sep 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
95
119
0
19 Aug 2021
Truncated Emphatic Temporal Difference Methods for Prediction and Control
Shangtong Zhang
Shimon Whiteson
OffRL
72
12
0
11 Aug 2021
Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings
Shengpu Tang
Jenna Wiens
OffRL
91
81
0
23 Jul 2021
Conservative Offline Distributional Reinforcement Learning
Yecheng Jason Ma
Dinesh Jayaraman
Osbert Bastani
OffRL
104
83
0
12 Jul 2021
Learning Expected Emphatic Traces for Deep RL
Ray Jiang
Shangtong Zhang
Veronica Chelu
Adam White
Hado van Hasselt
OffRL
60
12
0
12 Jul 2021
Supervised Off-Policy Ranking
Yue Jin
Yue Zhang
Tao Qin
Xudong Zhang
Jian Yuan
Houqiang Li
Tie-Yan Liu
OffRL
63
6
0
03 Jul 2021
Variance-Aware Off-Policy Evaluation with Linear Function Approximation
Yifei Min
Tianhao Wang
Dongruo Zhou
Quanquan Gu
OffRL
89
38
0
22 Jun 2021
Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations
Christoph Dann
Yishay Mansour
M. Mohri
Ayush Sekhari
Karthik Sridharan
OffRL
57
11
0
22 Jun 2021
The Curse of Passive Data Collection in Batch Reinforcement Learning
Chenjun Xiao
Ilbin Lee
Bo Dai
Dale Schuurmans
Csaba Szepesvári
OffRL
64
1
0
18 Jun 2021
A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
Scott Fujimoto
David Meger
Doina Precup
70
17
0
12 Jun 2021
Instrument Space Selection for Kernel Maximum Moment Restriction
Rui Zhang
Krikamol Muandet
Bernhard Schölkopf
Masaaki Imaizumi
64
3
0
07 Jun 2021