Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.01423
Cited By
Value Improved Actor Critic Algorithms
3 June 2024
Yaniv Oren
Moritz A. Zanger
Pascal R. van der Vaart
M. Spaan
Wendelin Bohmer
Wendelin Bohmer
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Value Improved Actor Critic Algorithms"
26 / 26 papers shown
Title
For SALE: State-Action Representation Learning for Deep Reinforcement Learning
Scott Fujimoto
Wei-Di Chang
Edward James Smith
S. Gu
Doina Precup
David Meger
OffRL
56
50
0
04 Jun 2023
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Jiachen Li
Edwin Zhang
Ming Yin
Qinxun Bai
Yu Wang
William Yang Wang
OffRL
55
15
0
29 Nov 2022
Mastering Atari Games with Limited Data
Weirui Ye
Shao-Wei Liu
Thanard Kurutach
Pieter Abbeel
Yang Gao
VLM
97
234
0
30 Oct 2021
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
272
895
0
12 Oct 2021
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
42
30
0
17 Jul 2021
Online and Offline Reinforcement Learning by Planning with a Learned Model
Julian Schrittwieser
Thomas Hubert
Amol Mandhane
M. Barekatain
Ioannis Antonoglou
David Silver
OffRL
53
116
0
13 Apr 2021
Muesli: Combining Improvements in Policy Optimization
Matteo Hessel
Ivo Danihelka
Fabio Viola
A. Guez
Simon Schmitt
Laurent Sifre
T. Weber
David Silver
H. V. Hasselt
57
66
0
13 Apr 2021
Convergence Proof for Actor-Critic Methods Applied to PPO and RUDDER
Markus Holzleitner
Lukas Gruber
Jose A. Arjona-Medina
Johannes Brandstetter
Sepp Hochreiter
47
38
0
02 Dec 2020
Monte-Carlo Tree Search as Regularized Policy Optimization
Jean-Bastien Grill
Florent Altché
Yunhao Tang
Thomas Hubert
Michal Valko
Ioannis Antonoglou
Rémi Munos
57
74
0
24 Jul 2020
An operator view of policy gradient methods
Dibya Ghosh
Marlos C. Machado
Nicolas Le Roux
OffRL
33
27
0
19 Jun 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
64
58
0
07 May 2020
On the Convergence of Approximate and Regularized Policy Iteration Schemes
E. Smirnova
Elvis Dohmatob
30
5
0
20 Sep 2019
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
116
2,418
0
13 Dec 2018
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
Samuel Neumann
Sungsu Lim
A. Joseph
Yangchen Pan
Adam White
Martha White
102
7
0
22 Oct 2018
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
69
476
0
14 Jun 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
167
5,161
0
26 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
256
8,303
0
04 Jan 2018
DeepMind Control Suite
Yuval Tassa
Yotam Doron
Alistair Muldal
Tom Erez
Yazhe Li
...
A. Abdolmaleki
J. Merel
Andrew Lefrancq
Timothy Lillicrap
Martin Riedmiller
ELM
LM&Ro
BDL
120
1,126
0
02 Jan 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
319
18,906
0
20 Jul 2017
Equivalence Between Policy Gradients and Soft Q-Learning
John Schulman
Xi Chen
Pieter Abbeel
OffRL
74
345
0
21 Apr 2017
Bridging the Gap Between Value and Policy Based Reinforcement Learning
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
142
470
0
28 Feb 2017
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
328
10,464
0
21 Jul 2016
Taming the Noise in Reinforcement Learning via Soft Updates
Roy Fox
Ari Pakman
Naftali Tishby
55
338
0
28 Dec 2015
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
146
7,621
0
22 Sep 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
283
13,208
0
09 Sep 2015
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
112
12,201
0
19 Dec 2013
1