ResearchTrend.AI
An Off-policy Policy Gradient Theorem Using Emphatic Weightings


22 November 2018
Ehsan Imani
Eric Graves
Martha White
    OffRL

Papers citing "An Off-policy Policy Gradient Theorem Using Emphatic Weightings"

12 / 12 papers shown
Title
Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks
Altun Rzayev
Vahid Tavakol Aghaei
OffRL
21
0
0
11 Dec 2022
Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees
Hsin-En Su
Yen-Ju Chen
Ping-Chun Hsieh
Xi Liu
OffRL
26
0
0
10 Dec 2022
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States
Francesco Faccio
Aditya A. Ramesh
Vincent Herrmann
J. Harb
Jürgen Schmidhuber
OffRL
44
8
0
04 Jul 2022
A Temporal-Difference Approach to Policy Gradient Estimation
Samuele Tosatto
Andrew Patterson
Martha White
A. R. Mahmood
OffRL
27
2
0
04 Feb 2022
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm
Raghuram Bharadwaj Diddigi
Prateek Jain
P. J
S. Bhatnagar
CML
OffRL
16
3
0
19 Oct 2021
PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method
Ziwei Guan
Tengyu Xu
Yingbin Liang
17
4
0
13 Oct 2021
Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates
Romain Laroche
Rémi Tachet des Combes
46
8
0
29 Sep 2021
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
25
29
0
17 Jul 2021
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
Jiawei Huang
Nan Jiang
19
5
0
02 Jun 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
65
29
0
26 May 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
47
24
0
23 Feb 2021
Direct and indirect reinforcement learning
Yang Guan
Shengbo Eben Li
Jingliang Duan
Jie Li
Yangang Ren
Qi Sun
B. Cheng
OffRL
38
34
0
23 Dec 2019