arXiv:2308.15969
Iterative Reward Shaping using Human Feedback for Correcting Reward Misspecification
30 August 2023
Jasmina Gajcin
J. McCarthy
Rahul Nair
Radu Marinescu
Elizabeth M. Daly
Ivana Dusparic
Papers citing "Iterative Reward Shaping using Human Feedback for Correcting Reward Misspecification"
12 / 12 papers shown
Learning Rewards to Optimize Global Performance Metrics in Deep Reinforcement Learning
Junqi Qian
Paul Weng
Chenmien Tan
71
1
0
16 Mar 2023
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models
Alexander Pan
Kush S. Bhatia
Jacob Steinhardt
102
182
0
10 Jan 2022
Reinforcement Learning with Trajectory Feedback
Yonathan Efroni
Nadav Merlis
Shie Mannor
87
45
0
13 Aug 2020
Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation
L. Guan
Mudit Verma
Sihang Guo
Ruohan Zhang
Subbarao Kambhampati
101
43
0
26 Jun 2020
Deep Reinforcement Learning for Autonomous Driving: A Survey
B. R. Kiran
Ibrahim Sobh
V. Talpaert
Patrick Mannion
A. A. Sallab
S. Yogamani
P. Pérez
358
1,689
0
02 Feb 2020
FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback
Baicen Xiao
Qifan Lu
Bhaskar Ramasubramanian
Andrew Clark
L. Bushnell
Radha Poovendran
64
25
0
19 Jan 2020
Interestingness Elements for Explainable Reinforcement Learning: Understanding Agents' Capabilities and Limitations
Pedro Sequeira
Melinda Gervasio
73
105
0
19 Dec 2019
A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress
Saurabh Arora
Prashant Doshi
OffRL
93
612
0
18 Jun 2018
Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces
Garrett A. Warnell
Nicholas R. Waytowich
Vernon J. Lawhern
Peter Stone
72
272
0
28 Sep 2017
A Brief Survey of Deep Reinforcement Learning
Kai Arulkumaran
M. Deisenroth
Miles Brundage
Anil Anthony Bharath
OffRL
136
2,830
0
19 Aug 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
218
3,377
0
12 Jun 2017
Concrete Problems in AI Safety
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané
256
2,405
0
21 Jun 2016