Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation

29 October 2018
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
OffRL
arXiv: 1810.12429

Papers citing "Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation"

50 / 252 papers shown
Title
Truly Deterministic Policy Optimization
Ehsan Saleh
Saba Ghaffari
Timothy Bretl
Matthew West
OffRL
57
3
0
30 May 2022
Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret
Jiawei Huang
Li Zhao
Tao Qin
Wei Chen
Nan Jiang
Tie-Yan Liu
OffRL
56
3
0
25 May 2022
Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
David Bruns-Smith
CML, ELM, OffRL
63
13
0
02 Apr 2022
Marginalized Operators for Off-policy Reinforcement Learning
Yunhao Tang
Mark Rowland
Rémi Munos
Michal Valko
OffRL
61
0
0
30 Mar 2022
Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps
Jinglin Chen
Nan Jiang
OffRL
120
35
0
25 Mar 2022
Bellman Residual Orthogonalization for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
OffRL
101
8
0
24 Mar 2022
Importance Sampling Placement in Off-Policy Temporal-Difference Methods
Eric Graves
Sina Ghiassian
OffRL
77
2
0
18 Mar 2022
DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning
Jinxin Liu
Hongyin Zhang
Donglin Wang
OffRL
85
37
0
13 Mar 2022
A Complete Characterization of Linear Estimators for Offline Policy Evaluation
Juan C. Perdomo
A. Krishnamurthy
Peter L. Bartlett
Sham Kakade
OffRL
69
3
0
08 Mar 2022
Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons
C. Shi
Shuang Luo
Yuan Le
Hongtu Zhu
R. Song
OffRL, OnRL
74
12
0
26 Feb 2022
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
69
9
0
23 Feb 2022
Continual Auxiliary Task Learning
Matt McLeod
Chun-Ping Lo
M. Schlegel
Andrew Jacobsen
Raksha Kumaraswamy
Martha White
Adam White
CLL
60
9
0
22 Feb 2022
Policy Evaluation for Temporal and/or Spatial Dependent Experiments
Shuang Luo
Ying Yang
Chengchun Shi
Fang Yao
Jieping Ye
Hongtu Zhu
110
8
0
22 Feb 2022
Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process
C. Shi
Jin Zhu
Ye Shen
Shuang Luo
Hong Zhu
R. Song
OffRL
115
34
0
22 Feb 2022
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
C. Shi
Runzhe Wan
Ge Song
Shuang Luo
R. Song
Hongtu Zhu
OffRL
73
6
0
21 Feb 2022
A Behavior Regularized Implicit Policy for Offline Reinforcement Learning
Shentao Yang
Zhendong Wang
Huangjie Zheng
Yihao Feng
Mingyuan Zhou
OffRL
57
9
0
19 Feb 2022
Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality
Jiawei Huang
Jinglin Chen
Li Zhao
Tao Qin
Nan Jiang
Tie-Yan Liu
OffRL
91
24
0
14 Feb 2022
Off-Policy Evaluation for Large Action Spaces via Embeddings
Yuta Saito
Thorsten Joachims
OffRL
82
46
0
13 Feb 2022
Offline Reinforcement Learning with Realizability and Single-policy Concentrability
Wenhao Zhan
Baihe Huang
Audrey Huang
Nan Jiang
Jason D. Lee
OffRL
394
112
0
09 Feb 2022
Stochastic Gradient Descent with Dependent Data for Offline Reinforcement Learning
Jing-rong Dong
Xin T. Tong
OffRL
89
2
0
06 Feb 2022
A Temporal-Difference Approach to Policy Gradient Estimation
Samuele Tosatto
Andrew Patterson
Martha White
A. R. Mahmood
OffRL
104
2
0
04 Feb 2022
Robust Imitation Learning from Corrupted Demonstrations
Liu Liu
Ziyang Tang
Lanqing Li
Dijun Luo
71
13
0
29 Jan 2022
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
Xiaohong Chen
Zhengling Qi
OffRL
91
35
0
17 Jan 2022
Operator Deep Q-Learning: Zero-Shot Reward Transferring in Reinforcement Learning
Ziyang Tang
Yihao Feng
Qiang Liu
OffRL
43
1
0
01 Jan 2022
Off Environment Evaluation Using Convex Risk Minimization
Pulkit Katdare
Shuijing Liu
Katherine Driggs-Campbell
47
2
0
21 Dec 2021
Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency
Mingfei Sun
Sam Devlin
Katja Hofmann
Shimon Whiteson
30
4
0
11 Dec 2021
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
93
4
0
29 Nov 2021
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes
C. Shi
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
90
26
0
12 Nov 2021
SOPE: Spectrum of Off-Policy Estimators
C. J. Yuan
Yash Chandak
S. Giguere
Philip S. Thomas
S. Niekum
OffRL
93
5
0
06 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
107
12
0
04 Nov 2021
Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning
Siyuan Zhang
Nan Jiang
OffRL
93
39
0
26 Oct 2021
False Correlation Reduction for Offline Reinforcement Learning
Arvindkumar Krishnakumar
Zuyue Fu
Lingxiao Wang
Zhuoran Yang
Chenjia Bai
Tianyi Zhou
Judy Hoffman
Jing Jiang
OffRL
76
9
0
24 Oct 2021
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm
Raghuram Bharadwaj Diddigi
Prateek Jain
P. J
S. Bhatnagar
CML, OffRL
81
3
0
19 Oct 2021
Offline Reinforcement Learning with Soft Behavior Regularization
Haoran Xu
Xianyuan Zhan
Jianxiong Li
Honglei Yin
OffRL
79
31
0
14 Oct 2021
Explaining Off-Policy Actor-Critic From A Bias-Variance Perspective
Ting-Han Fan
Peter J. Ramadge
CML, FAtt, OffRL
63
2
0
06 Oct 2021
The $f$-Divergence Reinforcement Learning Framework
Chen Gong
Qiang He
Yunpeng Bai
Zhouyi Yang
Xiaoyu Chen
Xinwen Hou
Xianjie Zhang
Yu Liu
Guoliang Fan
68
3
0
24 Sep 2021
DROMO: Distributionally Robust Offline Model-based Policy Optimization
Ruizhen Liu
Dazhi Zhong
Zhi-Cong Chen
OffRL
53
3
0
15 Sep 2021
State Relevance for Off-Policy Evaluation
S. Shen
Yecheng Ma
Omer Gottesman
Finale Doshi-Velez
OffRL
59
4
0
13 Sep 2021
Projected State-action Balancing Weights for Offline Reinforcement Learning
Jiayi Wang
Zhengling Qi
Raymond K. W. Wong
OffRL
73
19
0
10 Sep 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
95
119
0
19 Aug 2021
Truncated Emphatic Temporal Difference Methods for Prediction and Control
Shangtong Zhang
Shimon Whiteson
OffRL
72
12
0
11 Aug 2021
Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings
Shengpu Tang
Jenna Wiens
OffRL
91
81
0
23 Jul 2021
Conservative Offline Distributional Reinforcement Learning
Yecheng Jason Ma
Dinesh Jayaraman
Osbert Bastani
OffRL
104
83
0
12 Jul 2021
Learning Expected Emphatic Traces for Deep RL
Ray Jiang
Shangtong Zhang
Veronica Chelu
Adam White
Hado van Hasselt
OffRL
60
12
0
12 Jul 2021
Supervised Off-Policy Ranking
Yue Jin
Yue Zhang
Tao Qin
Xudong Zhang
Jian Yuan
Houqiang Li
Tie-Yan Liu
OffRL
63
6
0
03 Jul 2021
Variance-Aware Off-Policy Evaluation with Linear Function Approximation
Yifei Min
Tianhao Wang
Dongruo Zhou
Quanquan Gu
OffRL
89
38
0
22 Jun 2021
Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations
Christoph Dann
Yishay Mansour
M. Mohri
Ayush Sekhari
Karthik Sridharan
OffRL
57
11
0
22 Jun 2021
The Curse of Passive Data Collection in Batch Reinforcement Learning
Chenjun Xiao
Ilbin Lee
Bo Dai
Dale Schuurmans
Csaba Szepesvári
OffRL
64
1
0
18 Jun 2021
A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
Scott Fujimoto
David Meger
Doina Precup
70
17
0
12 Jun 2021
Instrument Space Selection for Kernel Maximum Moment Restriction
Rui Zhang
Krikamol Muandet
Bernhard Schölkopf
Masaaki Imaizumi
64
3
0
07 Jun 2021