Calculus on MDPs: Potential Shaping as a Gradient
arXiv:2208.09570 · v2 (latest) · 20 August 2022
Erik Jenner, H. V. Hoof, Adam Gleave

Papers citing "Calculus on MDPs: Potential Shaping as a Gradient" (11 of 11 papers shown)
Preprocessing Reward Functions for Interpretability
Erik Jenner, Adam Gleave
25 Mar 2022 · 121 · 8 · 0

Invariance in Policy Optimisation and Partial Identifiability in Reward Learning
Joar Skalse, Matthew Farrugia-Roberts, Stuart J. Russell, Alessandro Abate, Adam Gleave
14 Mar 2022 · 56 · 48 · 0

Dynamics-Aware Comparison of Learned Reward Functions
Blake Wulfe, Ashwin Balakrishna, Logan Ellis, Jean Mercat, R. McAllister, Adrien Gaidon
25 Jan 2022 · 28 · 15 · 0

Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks
Ingmar Schubert, Ozgur S. Oguz, Marc Toussaint
OffRL
14 Jul 2021 · 55 · 6 · 0

Identifiability in inverse reinforcement learning
Haoyang Cao, Samuel N. Cohen, Lukasz Szpruch
07 Jun 2021 · 45 · 47 · 0

Understanding Learned Reward Functions
Eric J. Michaud, Adam Gleave, Stuart J. Russell
XAI, OffRL
10 Dec 2020 · 67 · 34 · 0

Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning
Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, Sheila A. McIlraith
OffRL
06 Oct 2020 · 114 · 222 · 0

Quantifying Differences in Reward Functions
Adam Gleave, Michael Dennis, Shane Legg, Stuart J. Russell, Jan Leike
OffRL
24 Jun 2020 · 124 · 68 · 0

Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving
ALM
18 Sep 2019 · 474 · 1,766 · 0

Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
Daniel S. Brown, Wonjoon Goo, P. Nagarajan, S. Niekum
12 Apr 2019 · 76 · 358 · 0

Deep reinforcement learning from human preferences
Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei
12 Jun 2017 · 216 · 3,364 · 0