1612.08967
Cited By
Efficient iterative policy optimization
28 December 2016
Nicolas Le Roux
Papers citing "Efficient iterative policy optimization"
3 / 3 papers shown
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs
Nicolas Le Roux
Marc G. Bellemare
Jonathan Lebensold
Arnaud Bergeron
Joshua Greaves
Alex Fréchette
Carolyne Pelletier
Eric Thibodeau-Laufer
Sándor Toth
Sam Work
OffRL
162
6
0
18 Mar 2025
Gradient Estimation Using Stochastic Computation Graphs
John Schulman
N. Heess
T. Weber
Pieter Abbeel
OffRL
148
395
0
17 Jun 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
129
3,438
0
08 Jun 2015