2407.14681
Value Internalization: Learning and Generalizing from Social Reward
19 July 2024
Frieda Rong
Max Kleiman-Weiner
Papers citing "Value Internalization: Learning and Generalizing from Social Reward"
6 / 6 papers shown
Title
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
886
13,207
0
04 Mar 2022
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models
Alexander Pan
Kush S. Bhatia
Jacob Steinhardt
94
182
0
10 Jan 2022
Reward-rational (implicit) choice: A unifying formalism for reward learning
Hong Jun Jeon
S. Milli
Anca Dragan
76
177
0
12 Feb 2020
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
541
19,296
0
20 Jul 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
218
3,365
0
12 Jun 2017
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
Tejas D. Kulkarni
Karthik Narasimhan
A. Saeedi
J. Tenenbaum
74
1,137
0
20 Apr 2016