2205.09123
Cited By
A2C is a special case of PPO
18 May 2022
Shengyi Huang
Anssi Kanervisto
Antonin Raffin
Weixun Wang
Santiago Ontañón
Rousslan Fernand Julien Dossa
OffRL
Papers citing "A2C is a special case of PPO"
8 / 8 papers shown
Title
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch
Shengyi Huang
Sophie Xhonneux
Arian Hosseini
Rishabh Agarwal
Rameswar Panda
OffRL
113
10
0
23 Oct 2024
Strategies for Using Proximal Policy Optimization in Mobile Puzzle Games
J. Kristensen
Paolo Burelli
OffRL
37
15
0
03 Jul 2020
MushroomRL: Simplifying Reinforcement Learning Research
Carlo D'Eramo
Davide Tateo
Andrea Bonarini
Marcello Restelli
Jan Peters
OffRL
41
85
0
04 Jan 2020
Google Research Football: A Novel Reinforcement Learning Environment
Karol Kurach
Anton Raichuk
Piotr Stańczyk
Michal Zajac
Olivier Bachem
...
C. Riquelme
Damien Vincent
Marcin Michalski
Olivier Bousquet
Sylvain Gelly
133
402
0
25 Jul 2019
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
444
18,931
0
20 Jul 2017
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
189
8,833
0
04 Feb 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
82
3,399
0
08 Jun 2015
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
114
12,201
0
19 Dec 2013