Is the Policy Gradient a Gradient?

17 June 2019

Papers citing "Is the Policy Gradient a Gradient?"

13 / 13 papers shown

Title
Finite-Sample Analysis of Proximal Gradient TD Algorithms Bo Liu Ji Liu Mohammad Ghavamzadeh Sridhar Mahadevan Marek Petrik 59 158 0 06 Jun 2020
Addressing Function Approximation Error in Actor-Critic Methods Scott Fujimoto H. V. Hoof David Meger OffRL 169 5,178 0 26 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor Tuomas Haarnoja Aurick Zhou Pieter Abbeel Sergey Levine 292 8,329 0 04 Jan 2018
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation Yuhuai Wu Elman Mansimov Shun Liao Roger C. Grosse Jimmy Ba OffRL 52 626 0 17 Aug 2017
Proximal Policy Optimization Algorithms John Schulman Filip Wolski Prafulla Dhariwal Alec Radford Oleg Klimov OffRL 468 19,006 0 20 Jul 2017
Sample Efficient Actor-Critic with Experience Replay Ziyun Wang V. Bapst N. Heess Volodymyr Mnih Rémi Munos Koray Kavukcuoglu Nando de Freitas 97 761 0 03 Nov 2016
Asynchronous Methods for Deep Reinforcement Learning Volodymyr Mnih Adria Puigdomenech Badia M. Berk Mirza Alex Graves Timothy Lillicrap Tim Harley David Silver Koray Kavukcuoglu 191 8,850 0 04 Feb 2016
Continuous control with deep reinforcement learning Timothy Lillicrap Jonathan J. Hunt Alexander Pritzel N. Heess Tom Erez Yuval Tassa David Silver Daan Wierstra 318 13,234 0 09 Sep 2015
Emphatic Temporal-Difference Learning A. R. Mahmood Huizhen Yu Martha White R. Sutton 151 33 0 06 Jul 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation John Schulman Philipp Moritz Sergey Levine Michael I. Jordan Pieter Abbeel OffRL 84 3,406 0 08 Jun 2015
Trust Region Policy Optimization John Schulman Sergey Levine Philipp Moritz Michael I. Jordan Pieter Abbeel 277 6,764 0 19 Feb 2015
The Optimal Reward Baseline for Gradient-Based Reinforcement Learning Lex Weaver Nigel Tao 117 247 0 10 Jan 2013
The Arcade Learning Environment: An Evaluation Platform for General Agents Marc G. Bellemare Yavar Naddaf J. Veness Michael Bowling 109 3,004 0 19 Jul 2012