2410.08458
v1
v2 (latest)
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
11 October 2024
Abhijnan Nath
Changsoo Jung
Ethan Seefried
Nikhil Krishnaswamy
Papers citing
"Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both"
11 / 61 papers shown
Title
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
Xue Bin Peng
Aviral Kumar
Grace Zhang
Sergey Levine
OffRL
157
570
0
01 Oct 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
491
1,770
0
18 Sep 2019
Neural Text Generation with Unlikelihood Training
Sean Welleck
Ilia Kulikov
Stephen Roller
Emily Dinan
Kyunghyun Cho
Jason Weston
MU
68
583
0
12 Aug 2019
The Curious Case of Neural Text Degeneration
Ari Holtzman
Jan Buys
Li Du
Maxwell Forbes
Yejin Choi
207
3,213
0
22 Apr 2019
Parameter-Efficient Transfer Learning for NLP
N. Houlsby
A. Giurgiu
Stanislaw Jastrzebski
Bruna Morrone
Quentin de Laroussilhe
Andrea Gesmundo
Mona Attariyan
Sylvain Gelly
223
4,529
0
02 Feb 2019
Decoupled Weight Decay Regularization
I. Loshchilov
Frank Hutter
OffRL
154
2,158
0
14 Nov 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
571
19,315
0
20 Jul 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
218
3,377
0
12 Jun 2017
Concrete Problems in AI Safety
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané
256
2,405
0
21 Jun 2016
Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond
Ramesh Nallapati
Bowen Zhou
Cicero Nogueira dos Santos
Çağlar Gülçehre
Bing Xiang
AIMat
294
2,568
0
19 Feb 2016
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
367
19,745
0
09 Mar 2015