On the Convergence of Approximate and Regularized Policy Iteration Schemes

20 September 2019

Papers citing "On the Convergence of Approximate and Regularized Policy Iteration Schemes"

Title
Value Improved Actor Critic Algorithms Yaniv Oren Moritz A. Zanger Pascal R. van der Vaart M. Spaan Wendelin Bohmer OffRL 63 0 0 03 Jun 2024
A Theory of Regularized Markov Decision Processes Matthieu Geist B. Scherrer Olivier Pietquin 109 325 0 31 Jan 2019
Soft Actor-Critic Algorithms and Applications Tuomas Haarnoja Aurick Zhou Kristian Hartikainen George Tucker Sehoon Ha ... Vikash Kumar Henry Zhu Abhishek Gupta Pieter Abbeel Sergey Levine 133 2,422 0 13 Dec 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor Tuomas Haarnoja Aurick Zhou Pieter Abbeel Sergey Levine 284 8,313 0 04 Jan 2018
Boltzmann Exploration Done Right Nicolò Cesa-Bianchi Claudio Gentile Gábor Lugosi Gergely Neu 82 168 0 29 May 2017
A unified view of entropy-regularized Markov decision processes Gergely Neu Anders Jonsson Vicenç Gómez 93 262 0 22 May 2017
Bridging the Gap Between Value and Policy Based Reinforcement Learning Ofir Nachum Mohammad Norouzi Kelvin Xu Dale Schuurmans 152 470 0 28 Feb 2017
Reinforcement Learning with Deep Energy-Based Policies Tuomas Haarnoja Haoran Tang Pieter Abbeel Sergey Levine 95 1,339 0 27 Feb 2017
Taming the Noise in Reinforcement Learning via Soft Updates Roy Fox Ari Pakman Naftali Tishby 67 338 0 28 Dec 2015
Increasing the Action Gap: New Operators for Reinforcement Learning Marc G. Bellemare Georg Ostrovski A. Guez Philip S. Thomas Rémi Munos 68 157 0 15 Dec 2015
Dynamic Policy Programming M. G. Azar Vicenç Gómez H. Kappen 95 123 0 12 Apr 2010