ResearchTrend.AI
Correcting discount-factor mismatch in on-policy policy gradient methods


23 June 2023
Fengdi Che
Gautham Vasan
A. R. Mahmood
    OffRL

Papers citing "Correcting discount-factor mismatch in on-policy policy gradient methods"

2 / 2 papers shown
Bayesian Q-learning With Imperfect Expert Demonstrations
Fengdi Che
Xiru Zhu
Doina Precup
David Meger
Gregory Dudek
19
2
0
01 Oct 2022
Learning Expected Emphatic Traces for Deep RL
Ray Jiang
Shangtong Zhang
Veronica Chelu
Adam White
Hado van Hasselt
OffRL
35
12
0
12 Jul 2021