ResearchTrend.AI
A2C is a special case of PPO

18 May 2022
Shengyi Huang
Anssi Kanervisto
Antonin Raffin
Weixun Wang
Santiago Ontañón
Rousslan Fernand Julien Dossa
    OffRL

Papers citing "A2C is a special case of PPO"

8 / 8 papers shown
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch
Shengyi Huang
Sophie Xhonneux
Arian Hosseini
Rishabh Agarwal
Rameswar Panda
OffRL
113
10
0
23 Oct 2024
Strategies for Using Proximal Policy Optimization in Mobile Puzzle Games
J. Kristensen
Paolo Burelli
OffRL
37
15
0
03 Jul 2020
MushroomRL: Simplifying Reinforcement Learning Research
Carlo D'Eramo
Davide Tateo
Andrea Bonarini
Marcello Restelli
Jan Peters
OffRL
41
85
0
04 Jan 2020
Google Research Football: A Novel Reinforcement Learning Environment
Karol Kurach
Anton Raichuk
Piotr Stańczyk
Michal Zajac
Olivier Bachem
...
C. Riquelme
Damien Vincent
Marcin Michalski
Olivier Bousquet
Sylvain Gelly
133
402
0
25 Jul 2019
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
444
18,931
0
20 Jul 2017
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
189
8,833
0
04 Feb 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
82
3,399
0
08 Jun 2015
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
114
12,201
0
19 Dec 2013