ResearchTrend.AI

Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes

22 August 2019
Nathan Kallus
Masatoshi Uehara
    OffRL

Papers citing "Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes"

21 / 21 papers shown
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
Masanori Nojima
OffRL
195
0
0
02 May 2025
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
238
2
0
22 Feb 2025
Evaluation of Active Feature Acquisition Methods for Time-varying Feature Settings
Henrik von Kleist
Alireza Zamanian
I. Shpitser
Narges Ahmidi
OffRL
183
2
0
03 Dec 2023
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning
Ming Yin
Yu Wang
OffRL
106
82
0
29 Jan 2020
More Efficient Off-Policy Evaluation through Regularized Targeted Learning
Aurélien F. Bibaut
Ivana Malenica
N. Vlassis
Mark van der Laan
OOD, OffRL
42
41
0
13 Dec 2019
Fast rates for empirical risk minimization over càdlàg functions with bounded sectional variation norm
Aurélien F. Bibaut
Mark van der Laan
37
5
0
22 Jul 2019
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
81
54
0
09 Jun 2019
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
Tengyang Xie
Yifei Ma
Yu Wang
OffRL
97
181
0
08 Jun 2019
Characterization of parameters with a mixed bias property
A. Rotnitzky
Ezequiel Smucler
J. M. Robins
66
67
0
07 Apr 2019
Batch Policy Learning under Constraints
Hoang Minh Le
Cameron Voloshin
Yisong Yue
OffRL
58
333
0
20 Mar 2019
Non-Parametric Inference Adaptive to Intrinsic Dimension
Khashayar Khosravi
Greg Lewis
Vasilis Syrgkanis
453
7
0
11 Jan 2019
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
OffRL
158
356
0
29 Oct 2018
Deep Neural Networks Learn Non-Smooth Functions Effectively
Masaaki Imaizumi
Kenji Fukumizu
145
124
0
13 Feb 2018
More Robust Doubly Robust Off-policy Evaluation
Mehrdad Farajtabar
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
76
267
0
10 Feb 2018
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Yu Wang
Alekh Agarwal
Miroslav Dudík
OffRL
112
222
0
04 Dec 2016
Estimating Dynamic Treatment Regimes in Mobile Health Using V-learning
Daniel J. Luckett
Eric B. Laber
A. Kahkoska
D. Maahs
E. Mayer‐Davis
Michael R. Kosorok
67
137
0
10 Nov 2016
Safe and Efficient Off-Policy Reinforcement Learning
Rémi Munos
T. Stepleton
Anna Harutyunyan
Marc G. Bellemare
OffRL
138
615
0
08 Jun 2016
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
424
576
0
04 Apr 2016
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
Nan Jiang
Lihong Li
OffRL
202
623
0
11 Nov 2015
Adaptive Concentration of Regression Trees, with Application to Random Forests
Stefan Wager
G. Walther
109
25
0
22 Mar 2015
Doubly Robust Policy Evaluation and Optimization
Miroslav Dudík
D. Erhan
John Langford
Lihong Li
OffRL
182
286
0
10 Mar 2015