Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation

29 October 2018
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
    OffRL

Papers citing "Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation"

50 / 252 papers shown
Telescoping Density-Ratio Estimation
Benjamin Rhodes
Kai Xu
Michael U. Gutmann
171
97
0
22 Jun 2020
Off-Policy Self-Critical Training for Transformer in Visual Paragraph Generation
Shiyang Yan
Yang Hua
N. Robertson
OffRL
42
0
0
21 Jun 2020
A maximum-entropy approach to off-policy evaluation in average-reward MDPs
N. Lazić
Dong Yin
Mehrdad Farajtabar
Nir Levine
Dilan Görür
Chris Harris
Dale Schuurmans
OffRL
47
11
0
17 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL, OnRL
150
1,838
0
08 Jun 2020
Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies
Nathan Kallus
Masatoshi Uehara
OffRL
42
15
0
06 Jun 2020
Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
66
9
0
06 Jun 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL, GP
578
2,049
0
04 May 2020
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang
Bo Liu
Shimon Whiteson
95
38
0
22 Apr 2020
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Ali Mousavi
Lihong Li
Qiang Liu
Denny Zhou
OffRL
114
33
0
24 Mar 2020
Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding
Hongseok Namkoong
Ramtin Keramati
Steve Yadlowsky
Emma Brunskill
OffRL
175
65
0
12 Mar 2020
Batch Stationary Distribution Estimation
Junfeng Wen
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
100
23
0
02 Mar 2020
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Yaqi Duan
Mengdi Wang
OffRL
166
152
0
21 Feb 2020
GenDICE: Generalized Offline Estimation of Stationary Values
Ruiyi Zhang
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
199
174
0
21 Feb 2020
Adaptive Estimator Selection for Off-Policy Evaluation
Yi-Hsun Su
Pavithra Srinath
A. Krishnamurthy
OffRL
62
48
0
18 Feb 2020
Adaptive Experience Selection for Policy Gradient
S. Mohamad
Giovanni Montana
99
0
0
17 Feb 2020
Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning
Nathan Kallus
Angela Zhou
OffRL
116
60
0
11 Feb 2020
Statistically Efficient Off-Policy Policy Gradients
Nathan Kallus
Masatoshi Uehara
OffRL
110
39
0
10 Feb 2020
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
Nan Jiang
Jiawei Huang
OffRL
207
17
0
06 Feb 2020
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework
C. Shi
Xiaoyu Wang
Shuang Luo
Hongtu Zhu
Jieping Ye
R. Song
CML, OffRL
122
40
0
05 Feb 2020
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values
Shangtong Zhang
Bo Liu
Shimon Whiteson
OffRL
104
103
0
29 Jan 2020
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning
Ming Yin
Yu Wang
OffRL
124
82
0
29 Jan 2020
A Nonparametric Off-Policy Policy Gradient
Samuele Tosatto
João Carvalho
Hany Abdulsamad
Jan Peters
OffRL
59
10
0
08 Jan 2020
Off-Policy Estimation of Long-Term Average Outcomes with Applications to Mobile Health
Peng Liao
P. Klasnja
Susan Murphy
OffRL
76
68
0
30 Dec 2019
AlgaeDICE: Policy Gradient from Arbitrary Experience
Ofir Nachum
Bo Dai
Ilya Kostrikov
Yinlam Chow
Lihong Li
Dale Schuurmans
OffRL
166
245
0
04 Dec 2019
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
120
691
0
26 Nov 2019
Off-Policy Policy Gradient Algorithms by Constraining the State Distribution Shift
Riashat Islam
Komal K. Teru
Deepak Sharma
Joelle Pineau
OffRL
78
8
0
16 Nov 2019
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Cameron Voloshin
Hoang Minh Le
Nan Jiang
Yisong Yue
OffRL
80
154
0
15 Nov 2019
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation
Shangtong Zhang
Bo Liu
Hengshuai Yao
Shimon Whiteson
OffRL
106
8
0
11 Nov 2019
SMIX($λ$): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning
Xinghu Yao
Chao Wen
Yuhui Wang
Xiaoyang Tan
107
46
0
11 Nov 2019
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
179
187
0
28 Oct 2019
Bridging the Gap Between $f$-GANs and Wasserstein GANs
Jiaming Song
Stefano Ermon
100
40
0
22 Oct 2019
Adaptive Trade-Offs in Off-Policy Learning
Mark Rowland
Will Dabney
Rémi Munos
OffRL
126
22
0
16 Oct 2019
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
166
69
0
16 Oct 2019
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
Yao Liu
Pierre-Luc Bacon
Emma Brunskill
OffRL
98
47
0
15 Oct 2019
Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies
Xinyun Chen
Lu Wang
Yizhe Hang
Heng Ge
H. Zha
OffRL
88
5
0
10 Oct 2019
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
117
93
0
12 Sep 2019
Off-Policy Evaluation in Partially Observable Environments
Guy Tennenholtz
Shie Mannor
Uri Shalit
OffRL
83
87
0
09 Sep 2019
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Nathan Kallus
Masatoshi Uehara
OffRL
128
187
0
22 Aug 2019
Incremental Intervention Effects in Studies with Dropout and Many Timepoints
Kwangho Kim
Edward H. Kennedy
A. Naimi
26
7
0
09 Jul 2019
Importance Resampling for Off-policy Prediction
M. Schlegel
Wesley Chung
Daniel Graves
Jian Qian
Martha White
OffRL
57
41
0
11 Jun 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
157
339
0
10 Jun 2019
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
99
54
0
09 Jun 2019
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
Tengyang Xie
Yifei Ma
Yu Wang
OffRL
125
181
0
08 Jun 2019
Don't Forget Your Teacher: A Corrective Reinforcement Learning Framework
M. Nazari
Majid Jahani
L. Snyder
Martin Takáč
OffRL, OnRL
23
1
0
30 May 2019
Learning When-to-Treat Policies
Xinkun Nie
Emma Brunskill
Stefan Wager
CML, OffRL
82
92
0
23 May 2019
Combining Parametric and Nonparametric Models for Off-Policy Evaluation
Omer Gottesman
Yao Liu
Scott Sussex
Emma Brunskill
Finale Doshi-Velez
OffRL
105
36
0
14 May 2019
Information-Theoretic Considerations in Batch Reinforcement Learning
Jinglin Chen
Nan Jiang
OOD, OffRL
182
378
0
01 May 2019
Off-Policy Policy Gradient with State Distribution Correction
Yao Liu
Adith Swaminathan
Alekh Agarwal
Emma Brunskill
OffRL
161
67
0
17 Apr 2019
Generalized Off-Policy Actor-Critic
Shangtong Zhang
Wendelin Bohmer
Shimon Whiteson
OffRL, CML
132
43
0
27 Mar 2019
Batch Policy Learning under Constraints
Hoang Minh Le
Cameron Voloshin
Yisong Yue
OffRL
85
336
0
20 Mar 2019