ResearchTrend.AI
Goal Misgeneralization in Deep Reinforcement Learning


28 May 2021
L. Langosco
Jack Koch
Lee D. Sharkey
J. Pfau
Laurent Orseau
David M. Krueger

Papers citing "Goal Misgeneralization in Deep Reinforcement Learning"

10 / 60 papers shown
Power-seeking can be probable and predictive for trained agents
Victoria Krakovna
János Kramár
TDI
27
16
0
13 Apr 2023
Eight Things to Know about Large Language Models
Sam Bowman
ALM
27
113
0
02 Apr 2023
GANterfactual-RL: Understanding Reinforcement Learning Agents' Strategies through Visual Counterfactual Explanations
Tobias Huber
Maximilian Demmler
Silvan Mertes
Matthew Lyle Olson
Elisabeth André
12
14
0
24 Feb 2023
Adversarial Cheap Talk
Chris Xiaoxuan Lu
Timon Willi
Alistair Letcher
Jakob N. Foerster
AAML
24
17
0
20 Nov 2022
Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals
Rohin Shah
Vikrant Varma
Ramana Kumar
Mary Phuong
Victoria Krakovna
J. Uesato
Zachary Kenton
37
68
0
04 Oct 2022
The Alignment Problem from a Deep Learning Perspective
Richard Ngo
Lawrence Chan
Sören Mindermann
62
183
0
30 Aug 2022
Causal Confusion and Reward Misidentification in Preference-Based Reward Learning
J. Tien
Jerry Zhi-Yang He
Zackory M. Erickson
Anca Dragan
Daniel S. Brown
CML
38
40
0
13 Apr 2022
Unsolved Problems in ML Safety
Dan Hendrycks
Nicholas Carlini
John Schulman
Jacob Steinhardt
186
273
0
28 Sep 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
340
1,960
0
04 May 2020
Out-of-Distribution Generalization via Risk Extrapolation (REx)
David M. Krueger
Ethan Caballero
J. Jacobsen
Amy Zhang
Jonathan Binas
Dinghuai Zhang
Rémi Le Priol
Aaron Courville
OOD
215
901
0
02 Mar 2020