1810.12429
Cited By
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
29 October 2018
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
OffRL
Papers citing
"Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation"
50 / 252 papers shown
Title
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
Jiawei Huang
Nan Jiang
98
5
0
02 Jun 2021
The Power of Log-Sum-Exp: Sequential Density Ratio Matrix Estimation for Speed-Accuracy Optimization
Taiki Miyagawa
Akinori F. Ebihara
50
3
0
28 May 2021
On Instrumental Variable Regression for Deep Offline Policy Evaluation
Yutian Chen
Liyuan Xu
Çağlar Gülçehre
T. Paine
Arthur Gretton
Nando de Freitas
Arnaud Doucet
OffRL
117
18
0
21 May 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu Wang
OffRL
97
19
0
13 May 2021
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
53
38
0
10 May 2021
Towards Theoretical Understandings of Robust Markov Decision Processes: Sample Complexity and Asymptotics
Wenhao Yang
Liangyu Zhang
Zhihua Zhang
72
35
0
09 May 2021
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization
Michael Ruogu Zhang
T. Paine
Ofir Nachum
Cosmin Paduraru
George Tucker
Ziyun Wang
Mohammad Norouzi
OffRL
86
49
0
28 Apr 2021
Universal Off-Policy Evaluation
Yash Chandak
S. Niekum
Bruno C. da Silva
Erik Learned-Miller
Emma Brunskill
Philip S. Thomas
OffRL
ELM
101
53
0
26 Apr 2021
Nearly Horizon-Free Offline Reinforcement Learning
Zhaolin Ren
Jialian Li
Bo Dai
S. Du
Sujay Sanghavi
OffRL
83
49
0
25 Mar 2021
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds
Yihao Feng
Ziyang Tang
Na Zhang
Qiang Liu
OffRL
73
14
0
09 Mar 2021
Instabilities of Offline RL with Pre-Trained Neural Representation
Ruosong Wang
Yifan Wu
Ruslan Salakhutdinov
Sham Kakade
OffRL
155
42
0
08 Mar 2021
On the Convergence and Optimality of Policy Gradient for Markov Coherent Risk
Audrey Huang
Liu Leqi
Zachary Chase Lipton
Kamyar Azizzadenesheli
77
21
0
04 Mar 2021
Minimax Model Learning
Cameron Voloshin
Nan Jiang
Yisong Yue
OffRL
112
18
0
02 Mar 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
104
25
0
23 Feb 2021
Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
Botao Hao
X. Ji
Yaqi Duan
Hao Lu
Csaba Szepesvári
Mengdi Wang
OffRL
75
40
0
06 Feb 2021
Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
Ming Yin
Yu Bai
Yu Wang
OffRL
93
69
0
02 Feb 2021
Fast Rates for the Regret of Offline Reinforcement Learning
Yichun Hu
Nathan Kallus
Masatoshi Uehara
OffRL
112
30
0
31 Jan 2021
High-Confidence Off-Policy (or Counterfactual) Variance Estimation
Yash Chandak
Shiv Shankar
Philip S. Thomas
OffRL
31
8
0
25 Jan 2021
Average-Reward Off-Policy Policy Evaluation with Function Approximation
Shangtong Zhang
Yi Wan
R. Sutton
Shimon Whiteson
OffRL
73
31
0
08 Jan 2021
SDA: Improving Text Generation with Self Data Augmentation
Ping Yu
Ruiyi Zhang
Yang Zhao
Yizhe Zhang
Chunyuan Li
Changyou Chen
38
2
0
02 Jan 2021
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
193
360
0
30 Dec 2020
The Variational Method of Moments
Andrew Bennett
Nathan Kallus
98
30
0
17 Dec 2020
Policy Optimization as Online Learning with Mediator Feedback
Alberto Maria Metelli
Matteo Papini
P. D'Oro
Marcello Restelli
OffRL
54
10
0
15 Dec 2020
Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL
Andrea Zanette
OffRL
206
71
0
14 Dec 2020
Offline Policy Selection under Uncertainty
Mengjiao Yang
Bo Dai
Ofir Nachum
George Tucker
Dale Schuurmans
OffRL
57
35
0
12 Dec 2020
Optimal Mixture Weights for Off-Policy Evaluation with Multiple Behavior Policies
Jinlin Lai
Lixin Zou
Jiaxing Song
OffRL
15
1
0
29 Nov 2020
C-Learning: Learning to Achieve Goals via Recursive Classification
Benjamin Eysenbach
Ruslan Salakhutdinov
Sergey Levine
OffRL
78
71
0
17 Nov 2020
Reliable Off-policy Evaluation for Reinforcement Learning
Jie Wang
Rui Gao
H. Zha
OffRL
79
11
0
08 Nov 2020
Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient
Botao Hao
Yaqi Duan
Tor Lattimore
Csaba Szepesvári
Mengdi Wang
OffRL
142
27
0
08 Nov 2020
Off-Policy Interval Estimation with Lipschitz Value Iteration
Ziyang Tang
Yihao Feng
Na Zhang
Jian Peng
Qiang Liu
OffRL
47
6
0
29 Oct 2020
Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient
Samuele Tosatto
João Carvalho
Jan Peters
OffRL
62
7
0
27 Oct 2020
What are the Statistical Limits of Offline RL with Linear Function Approximation?
Ruosong Wang
Dean Phillips Foster
Sham Kakade
OffRL
175
164
0
22 Oct 2020
CoinDICE: Off-Policy Confidence Interval Estimation
Bo Dai
Ofir Nachum
Yinlam Chow
Lihong Li
Csaba Szepesvári
Dale Schuurmans
OffRL
76
87
0
22 Oct 2020
Optimal Off-Policy Evaluation from Multiple Logging Policies
Nathan Kallus
Yuta Saito
Masatoshi Uehara
OffRL
71
40
0
21 Oct 2020
Average-reward model-free reinforcement learning: a systematic review and literature mapping
Vektor Dewanto
George Dunn
A. Eshragh
M. Gallagher
Fred Roosta
81
30
0
18 Oct 2020
Instrumental Variable Regression via Kernel Maximum Moment Loss
Rui Zhang
Masaaki Imaizumi
Bernhard Schölkopf
Krikamol Muandet
89
9
0
15 Oct 2020
Offline Learning for Planning: A Summary
Giorgio Angelotti
Nicolas Drougard
Caroline Ponzoni Carvalho Chanel
OffRL
51
4
0
05 Oct 2020
Variance-Reduced Off-Policy Memory-Efficient Policy Search
Daoming Lyu
Qi Qi
Mohammad Ghavamzadeh
Hengshuai Yao
Tianbao Yang
Bo Liu
OffRL
71
7
0
14 Sep 2020
Accountable Off-Policy Evaluation With Kernel Bellman Statistics
Yihao Feng
Zhaolin Ren
Ziyang Tang
Qiang Liu
OffRL
141
44
0
15 Aug 2020
Batch Value-function Approximation with Only Realizability
Tengyang Xie
Nan Jiang
OffRL
404
121
0
11 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
87
43
0
02 Aug 2020
Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders
Andrew Bennett
Nathan Kallus
Lihong Li
Ali Mousavi
OffRL
79
44
0
27 Jul 2020
Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation
Ilya Kostrikov
Ofir Nachum
OffRL
52
31
0
27 Jul 2020
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
Susan Murphy
OffRL
131
85
0
23 Jul 2020
Hyperparameter Selection for Offline Reinforcement Learning
T. Paine
Cosmin Paduraru
Andrea Michi
Çağlar Gülçehre
Konrad Zolna
Alexander Novikov
Ziyun Wang
Nando de Freitas
GP
OffRL
195
148
0
17 Jul 2020
An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay
Scott Fujimoto
David Meger
Doina Precup
94
58
0
12 Jul 2020
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning
Ming Yin
Yu Bai
Yu Wang
OffRL
93
31
0
07 Jul 2020
Off-Policy Evaluation via the Regularized Lagrangian
Mengjiao Yang
Ofir Nachum
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
56
118
0
07 Jul 2020
Off-Policy Exploitability-Evaluation in Two-Player Zero-Sum Markov Games
Kenshi Abe
Yusuke Kaneko
OffRL
31
2
0
04 Jul 2020
Learning and Planning in Average-Reward Markov Decision Processes
Yi Wan
A. Naik
R. Sutton
OffRL
76
61
0
29 Jun 2020