On the Limitations of Markovian Rewards to Express Multi-Objective, Risk-Sensitive, and Modal Tasks

26 January 2024
Joar Skalse
Alessandro Abate
arXiv: 2401.14811

Papers citing "On the Limitations of Markovian Rewards to Express Multi-Objective, Risk-Sensitive, and Modal Tasks"

9 / 9 papers shown
Title
On Generalization Across Environments In Multi-Objective Reinforcement Learning
Jayden Teoh
Pradeep Varakantham
Peter Vamplew
OffRL
34
0
0
02 Mar 2025
The Partially Observable Off-Switch Game
Andrew Garber
Rohan Subramani
Linus Luu
Mark Bedaywi
Stuart J. Russell
Scott Emmons
73
0
0
25 Nov 2024
Multi-objective Reinforcement learning from AI Feedback
Marcus Williams
38
1
0
11 Jun 2024
Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
David Dalrymple
Joar Skalse
Yoshua Bengio
Stuart J. Russell
Max Tegmark
...
Clark Barrett
Ding Zhao
Zhi-Xuan Tan
Jeannette Wing
Joshua Tenenbaum
52
52
0
10 May 2024
Numeric Reward Machines
Kristina Levina
Nikolaos Pappas
Athanasios Karapantelakis
Aneta Vulgarakis Feljan
Jendrik Seipp
40
1
0
30 Apr 2024
Conditions on Preference Relations that Guarantee the Existence of Optimal Policies
Jonathan Colaco Carr
Prakash Panangaden
Doina Precup
26
1
0
03 Nov 2023
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Silviu Pitis
35
6
0
30 Sep 2023
Defining and Characterizing Reward Hacking
Joar Skalse
Nikolaus H. R. Howe
Dmitrii Krasheninnikov
David M. Krueger
59
55
0
27 Sep 2022
Utility Theory for Sequential Decision Making
Mehran Shakerinava
Siamak Ravanbakhsh
29
7
0
27 Jun 2022