2310.00036
Cited By
Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform
29 September 2023
Shengyi Huang
Jiayi Weng
Rujikorn Charakorn
Min Lin
Zhongwen Xu
Santiago Ontañón
ArXiv (abs)
PDF
HTML
Github (112★)
Papers citing
"Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform"
14 / 14 papers shown
Title
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch
Shengyi Huang
Sophie Xhonneux
Arian Hosseini
Rishabh Agarwal
Rameswar Panda
OffRL
136
11
0
23 Oct 2024
Deep Reinforcement Learning at the Edge of the Statistical Precipice
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Aaron Courville
Marc G. Bellemare
OffRL
118
673
0
30 Aug 2021
Podracer architectures for scalable Reinforcement Learning
Matteo Hessel
M. Kroiss
Aidan Clark
Iurii Kemaev
John Quan
Thomas Keck
Fabio Viola
H. V. Hasselt
54
39
0
13 Apr 2021
High-Throughput Synchronous Deep RL
Iou-Jen Liu
Raymond A. Yeh
Alex Schwing
OffRL
47
12
0
17 Dec 2020
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames
Erik Wijmans
Abhishek Kadian
Ari S. Morcos
Stefan Lee
Irfan Essa
Devi Parikh
Manolis Savva
Dhruv Batra
85
484
0
01 Nov 2019
An Empirical Model of Large-Batch Training
Sam McCandlish
Jared Kaplan
Dario Amodei
OpenAI Dota Team
67
278
0
14 Dec 2018
Accelerated Methods for Deep Reinforcement Learning
Adam Stooke
Pieter Abbeel
OffRL
OnRL
59
136
0
07 Mar 2018
Distributed Prioritized Experience Replay
Dan Horgan
John Quan
David Budden
Gabriel Barth-Maron
Matteo Hessel
H. V. Hasselt
David Silver
147
741
0
02 Mar 2018
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
L. Espeholt
Hubert Soyer
Rémi Munos
Karen Simonyan
Volodymyr Mnih
...
Vlad Firoiu
Tim Harley
Iain Dunning
Shane Legg
Koray Kavukcuoglu
220
1,605
0
05 Feb 2018
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
118
1,961
0
19 Sep 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
517
19,237
0
20 Jul 2017
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
202
8,875
0
04 Feb 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
104
3,434
0
08 Jun 2015
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
120
3,020
0
19 Jul 2012