2006.11266
An operator view of policy gradient methods
19 June 2020
Dibya Ghosh
Marlos C. Machado
Nicolas Le Roux
OffRL
Papers citing
"An operator view of policy gradient methods"
5 / 5 papers shown
Title
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs
Nicolas Le Roux
Marc G. Bellemare
Jonathan Lebensold
Arnaud Bergeron
Joshua Greaves
Alex Fréchette
Carolyne Pelletier
Eric Thibodeau-Laufer
Sándor Toth
Sam Work
OffRL
128
5
0
18 Mar 2025
Value Improved Actor Critic Algorithms
Yaniv Oren
Moritz A. Zanger
Pascal R. van der Vaart
M. Spaan
Wendelin Bohmer
OffRL
63
0
0
03 Jun 2024
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
Jason D. Lee
G. Mahajan
59
320
0
01 Aug 2019
Ray Interference: a Source of Plateaus in Deep Reinforcement Learning
Tom Schaul
Diana Borsa
Joseph Modayil
Razvan Pascanu
53
63
0
25 Apr 2019
Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review
Sergey Levine
AI4CE
BDL
73
671
0
02 May 2018