ResearchTrend.AI
Statistically Efficient Off-Policy Policy Gradients

10 February 2020
Nathan Kallus
Masatoshi Uehara
    OffRL

Papers citing "Statistically Efficient Off-Policy Policy Gradients"

10 / 10 papers shown
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
Yuhan Li
Eugene Han
Yifan Hu
Wenzhuo Zhou
Zhengling Qi
Yifan Cui
Ruoqing Zhu
OffRL
165
0
0
01 May 2025
Inference on Optimal Dynamic Policies via Softmax Approximation
Qizhao Chen
Morgane Austern
Vasilis Syrgkanis
OffRL
31
1
0
08 Mar 2023
Offline Policy Evaluation and Optimization under Confounding
Chinmaya Kausik
Yangyi Lu
Kevin Tan
Maggie Makar
Yixin Wang
Ambuj Tewari
OffRL
26
8
0
29 Nov 2022
Review of Metrics to Measure the Stability, Robustness and Resilience of Reinforcement Learning
L. Pullum
13
2
0
22 Mar 2022
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
41
24
0
23 Feb 2021
Fast Rates for the Regret of Offline Reinforcement Learning
Yichun Hu
Nathan Kallus
Masatoshi Uehara
OffRL
13
30
0
31 Jan 2021
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
21
87
0
12 Sep 2019
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Nathan Kallus
Masatoshi Uehara
OffRL
38
181
0
22 Aug 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
37
185
0
05 Jun 2019
Off-Policy Actor-Critic
T. Degris
Martha White
R. Sutton
OffRL
CML
163
220
0
22 May 2012