2107.01348
Examining average and discounted reward optimality criteria in reinforcement learning
3 July 2021
Vektor Dewanto
M. Gallagher
OffRL
Papers citing "Examining average and discounted reward optimality criteria in reinforcement learning"
9 / 9 papers shown
Title
Towards Tight Bounds on the Sample Complexity of Average-reward MDPs
Yujia Jin
Aaron Sidford
28
31
0
13 Jun 2021
Average-reward model-free reinforcement learning: a systematic review and literature mapping
Vektor Dewanto
George Dunn
A. Eshragh
M. Gallagher
Fred Roosta
64
30
0
18 Oct 2020
Zeroth-order Deterministic Policy Gradient
Harshat Kumar
Dionysios S. Kalogerias
George J. Pappas
Alejandro Ribeiro
OffRL
27
14
0
12 Jun 2020
Is the Policy Gradient a Gradient?
Chris Nota
Philip S. Thomas
76
58
0
17 Jun 2019
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
118
1,954
0
19 Sep 2017
Unifying task specification in reinforcement learning
Martha White
OffRL
52
90
0
07 Sep 2016
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL
ODL
223
5,077
0
05 Jun 2016
Infinite-Horizon Policy-Gradient Estimation
Jonathan Baxter
Peter L. Bartlett
100
811
0
03 Jun 2011
Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view
B. Scherrer
82
102
0
19 Nov 2010