ResearchTrend.AI
arXiv: 1912.05652
Learning Human Objectives by Evaluating Hypothetical Behavior

5 December 2019
S. Reddy
Anca Dragan
Sergey Levine
Shane Legg
Jan Leike

Papers citing "Learning Human Objectives by Evaluating Hypothetical Behavior"

21 / 21 papers shown
Learning Interpretable Models of Aircraft Handling Behaviour by Reinforcement Learning from Human Feedback
Tom Bewley
J. Lawry
Arthur G. Richards
30
1
0
26 May 2023
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark
Alexander Pan
Chan Jun Shern
Andy Zou
Nathaniel Li
Steven Basart
Thomas Woodside
Jonathan Ng
Hanlin Zhang
Scott Emmons
Dan Hendrycks
35
127
0
06 Apr 2023
A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors
Shangding Gu
Alap Kshirsagar
Yali Du
Guang Chen
Jan Peters
Alois C. Knoll
36
14
0
25 Feb 2023
On The Fragility of Learned Reward Functions
Lev McKinney
Yawen Duan
David M. Krueger
Adam Gleave
33
20
0
09 Jan 2023
Benchmarks and Algorithms for Offline Preference-Based Reward Learning
Daniel Shin
Anca Dragan
Daniel S. Brown
OffRL
17
53
0
03 Jan 2023
Time-Efficient Reward Learning via Visually Assisted Cluster Ranking
David Zhang
Micah Carroll
Andreea Bobu
Anca Dragan
26
4
0
30 Nov 2022
Reward Learning with Trees: Methods and Evaluation
Tom Bewley
J. Lawry
Arthur G. Richards
R. Craddock
Ian Henderson
23
1
0
03 Oct 2022
Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans
John J. Nay
ELM
AILaw
88
27
0
14 Sep 2022
Negative Human Rights as a Basis for Long-term AI Safety and Regulation
Ondrej Bajgar
Jan Horenovsky
FaML
24
9
0
31 Aug 2022
Forecasting Future World Events with Neural Networks
Andy Zou
Tristan Xiao
Ryan Jia
Joe Kwon
Mantas Mazeika
Richard Li
Dawn Song
Jacob Steinhardt
Owain Evans
Dan Hendrycks
30
22
0
30 Jun 2022
Aligning to Social Norms and Values in Interactive Narratives
Prithviraj Ammanabrolu
Liwei Jiang
Maarten Sap
Hannaneh Hajishirzi
Yejin Choi
AI4CE
28
47
0
04 May 2022
Safe Deep RL in 3D Environments using Human Feedback
Matthew Rahtz
Vikrant Varma
Ramana Kumar
Zachary Kenton
Shane Legg
Jan Leike
32
4
0
20 Jan 2022
Inducing Structure in Reward Learning by Learning Features
Andreea Bobu
Marius Wiggert
Claire Tomlin
Anca Dragan
27
30
0
18 Jan 2022
On Optimizing Interventions in Shared Autonomy
Weihao Tan
David Koleczek
Siddhant Pradhan
Nicholas Perello
Vivek Chettiar
Vishal Rohra
Aaslesha Rajaram
Soundararajan Srinivasan
H. M. S. Hossain
Yash Chandak
31
4
0
16 Dec 2021
Learning Perceptual Concepts by Bootstrapping from Human Queries
Andreea Bobu
Chris Paxton
Wei Yang
Balakumar Sundaralingam
Yu-Wei Chao
Maya Cakmak
Dieter Fox
SSL
35
17
0
09 Nov 2021
B-Pref: Benchmarking Preference-Based Reinforcement Learning
Kimin Lee
Laura M. Smith
Anca Dragan
Pieter Abbeel
OffRL
42
93
0
04 Nov 2021
Play to Grade: Testing Coding Games as Classifying Markov Decision Process
Allen Nie
Emma Brunskill
Chris Piech
29
11
0
27 Oct 2021
What Would Jiminy Cricket Do? Towards Agents That Behave Morally
Dan Hendrycks
Mantas Mazeika
Andy Zou
Sahil Patel
Christine Zhu
Jesus Navarro
D. Song
Bo-wen Li
Jacob Steinhardt
16
58
0
25 Oct 2021
Avoiding Tampering Incentives in Deep RL via Decoupled Approval
J. Uesato
Ramana Kumar
Victoria Krakovna
Tom Everitt
Richard Ngo
Shane Legg
26
14
0
17 Nov 2020
Feature Expansive Reward Learning: Rethinking Human Input
Andreea Bobu
Marius Wiggert
Claire Tomlin
Anca Dragan
27
44
0
23 Jun 2020
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
276
5,683
0
05 Dec 2016