ResearchTrend.AI

Examining average and discounted reward optimality criteria in reinforcement learning

3 July 2021
Vektor Dewanto
M. Gallagher
    OffRL

Papers citing "Examining average and discounted reward optimality criteria in reinforcement learning"

9 / 9 papers shown
Towards Tight Bounds on the Sample Complexity of Average-reward MDPs
Yujia Jin
Aaron Sidford
35
31
0
13 Jun 2021
Average-reward model-free reinforcement learning: a systematic review and literature mapping
Vektor Dewanto
George Dunn
A. Eshragh
M. Gallagher
Fred Roosta
64
30
0
18 Oct 2020
Zeroth-order Deterministic Policy Gradient
Harshat Kumar
Dionysios S. Kalogerias
George J. Pappas
Alejandro Ribeiro
OffRL
27
14
0
12 Jun 2020
Is the Policy Gradient a Gradient?
Chris Nota
Philip S. Thomas
76
58
0
17 Jun 2019
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
118
1,954
0
19 Sep 2017
Unifying task specification in reinforcement learning
Martha White
OffRL
52
90
0
07 Sep 2016
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL, ODL
223
5,077
0
05 Jun 2016
Infinite-Horizon Policy-Gradient Estimation
Jonathan Baxter
Peter L. Bartlett
100
811
0
03 Jun 2011
Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view
B. Scherrer
82
102
0
19 Nov 2010