2306.13284
Correcting discount-factor mismatch in on-policy policy gradient methods
23 June 2023
Fengdi Che
Gautham Vasan
A. R. Mahmood
OffRL
Papers citing "Correcting discount-factor mismatch in on-policy policy gradient methods"
2 / 2 papers shown
Title
Bayesian Q-learning With Imperfect Expert Demonstrations
Fengdi Che
Xiru Zhu
Doina Precup
David Meger
Gregory Dudek
19
2
0
01 Oct 2022
Learning Expected Emphatic Traces for Deep RL
Ray Jiang
Shangtong Zhang
Veronica Chelu
Adam White
Hado van Hasselt
OffRL
35
12
0
12 Jul 2021