Beyond variance reduction: Understanding the true impact of baselines on policy optimization

31 August 2020
Wesley Chung
Valentin Thomas
Marlos C. Machado
Nicolas Le Roux
    OffRL
arXiv: 2008.13773

Papers citing "Beyond variance reduction: Understanding the true impact of baselines on policy optimization"

7 / 7 papers shown
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs
Nicolas Le Roux
Marc G. Bellemare
Jonathan Lebensold
Arnaud Bergeron
Joshua Greaves
Alex Fréchette
Carolyne Pelletier
Eric Thibodeau-Laufer
Sándor Toth
Sam Work
OffRL
89
2
0
18 Mar 2025
Behind the Myth of Exploration in Policy Gradients
Adrien Bolland
Gaspard Lambrechts
Damien Ernst
53
0
0
31 Jan 2024
Target-independent XLA optimization using Reinforcement Learning
Milan Ganai
Haichen Li
Theodore Enns
Yida Wang
Randy Huang
39
0
0
28 Aug 2023
The Role of Baselines in Policy Gradient Optimization
Jincheng Mei
Wesley Chung
Valentin Thomas
Bo Dai
Csaba Szepesvári
Dale Schuurmans
29
15
0
16 Jan 2023
When Bioprocess Engineering Meets Machine Learning: A Survey from the Perspective of Automated Bioprocess Development
Nghia Duong-Trung
Stefan Born
Jong Woo Kim
M. Schermeyer
Katharina Paulick
...
Thorben Werner
Randolf Scholz
Lars Schmidt-Thieme
Peter Neubauer
Ernesto Martinez
34
20
0
02 Sep 2022
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
Matilde Gargiani
Andrea Zanelli
Andrea Martinelli
Tyler H. Summers
John Lygeros
33
14
0
01 Feb 2022
Knowledge Infused Policy Gradients with Upper Confidence Bound for Relational Bandits
Kaushik Roy
Qi Zhang
Manas Gaur
A. Sheth
OffRL
28
15
0
25 Jun 2021