Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2203.02155
Cited By
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Training language models to follow instructions with human feedback"
50 / 6,372 papers shown
Title
Fine-tuning language models to find agreement among humans with diverse preferences
Michiel A. Bakker
Martin Chadwick
Hannah R. Sheahan
Michael Henry Tessler
Lucy Campbell-Gillingham
...
Nat McAleese
Amelia Glaese
John Aslanides
M. Botvinick
Christopher Summerfield
ALM
110
237
0
28 Nov 2022
Solving math word problems with process- and outcome-based feedback
J. Uesato
Nate Kushman
Ramana Kumar
Francis Song
Noah Y. Siegel
L. Wang
Antonia Creswell
G. Irving
I. Higgins
FaML
ReLM
AIMat
LRM
135
362
0
25 Nov 2022
Complementary Explanations for Effective In-Context Learning
Xi Ye
Srini Iyer
Asli Celikyilmaz
Ves Stoyanov
Greg Durrett
Ramakanth Pasunuru
ReLM
LRM
112
96
0
25 Nov 2022
Leveraging Data Recasting to Enhance Tabular Reasoning
Aashna Jena
Vivek Gupta
Manish Shrivastava
Julian Martin Eisenschlos
LMTD
47
6
0
23 Nov 2022
HyperTuning: Toward Adapting Large Language Models without Back-propagation
Jason Phang
Yi Mao
Pengcheng He
Weizhu Chen
96
34
0
22 Nov 2022
Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback
Josh Abramson
Arun Ahuja
Federico Carnevale
Petko Georgiev
Alex Goldin
...
Tamara von Glehn
Greg Wayne
Nathaniel Wong
Chen Yan
Rui Zhu
81
29
0
21 Nov 2022
Deanthropomorphising NLP: Can a Language Model Be Conscious?
Matthew Shardlow
Piotr Przybyła
64
7
0
21 Nov 2022
Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text
Qianhui Wu
Huiqiang Jiang
Haonan Yin
Börje F. Karlsson
Chin-Yew Lin
115
12
0
21 Nov 2022
InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
325
1,843
0
17 Nov 2022
UPTON: Preventing Authorship Leakage from Public Text Release via Data Poisoning
Ziyao Wang
Thai Le
Dongwon Lee
88
1
0
17 Nov 2022
Ignore Previous Prompt: Attack Techniques For Language Models
Fábio Perez
Ian Ribeiro
SILM
106
452
0
17 Nov 2022
Task-aware Retrieval with Instructions
Akari Asai
Timo Schick
Patrick Lewis
Xilun Chen
Gautier Izacard
Sebastian Riedel
Hannaneh Hajishirzi
Wen-tau Yih
109
98
0
16 Nov 2022
GAMMT: Generative Ambiguity Modeling Using Multiple Transformers
Xingcheng Xu
81
0
0
16 Nov 2022
GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective
Linyi Yang
Shuibai Zhang
Libo Qin
Yafu Li
Yidong Wang
Hanmeng Liu
Jindong Wang
Xingxu Xie
Yue Zhang
ELM
188
82
0
15 Nov 2022
A taxonomic system for failure cause analysis of open source AI incidents
Nikiforos Pittaras
Sean McGregor
46
10
0
14 Nov 2022
The CRINGE Loss: Learning what language not to model
Leonard Adolphs
Tianyu Gao
Jing Xu
Kurt Shuster
Sainbayar Sukhbaatar
Jason Weston
MU
95
37
0
10 Nov 2022
LMentry: A Language Model Benchmark of Elementary Language Tasks
Avia Efrat
Or Honovich
Omer Levy
105
20
0
03 Nov 2022
Large Language Models Are Human-Level Prompt Engineers
Yongchao Zhou
Andrei Ioan Muresanu
Ziwen Han
Keiran Paster
Silviu Pitis
Harris Chan
Jimmy Ba
ALM
LLMAG
195
906
0
03 Nov 2022
Learning to Solve Voxel Building Embodied Tasks from Pixels and Natural Language Instructions
Alexey Skrynnik
Zoya Volovikova
Marc-Alexandre Côté
Anton Voronov
Artem Zholus
...
Milagro Teruel
Ahmed Hassan Awadallah
Aleksandr I. Panov
Andrey Kravchenko
Julia Kiseleva
LM&Ro
107
11
0
01 Nov 2022
CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation
Abhilasha Ravichander
Matt Gardner
Ana Marasović
112
35
0
01 Nov 2022
GPS: Genetic Prompt Search for Efficient Few-shot Learning
Hanwei Xu
Yujun Chen
Yulun Du
Nan Shao
Yanggang Wang
Haiyu Li
Zhilin Yang
VLM
63
31
0
31 Oct 2022
Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences
L. Guan
Karthik Valmeekam
Subbarao Kambhampati
97
8
0
28 Oct 2022
When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels
Weiyan Shi
Emily Dinan
Kurt Shuster
Jason Weston
Jing Xu
116
20
0
28 Oct 2022
Can language models handle recursively nested grammatical structures? A case study on comparing models and humans
Andrew Kyle Lampinen
ReLM
ELM
125
36
0
27 Oct 2022
Will we run out of data? Limits of LLM scaling based on human-generated data
Pablo Villalobos
A. Ho
J. Sevilla
T. Besiroglu
Lennart Heim
Marius Hobbhahn
ALM
102
125
0
26 Oct 2022
RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering
Victor Zhong
Weijia Shi
Wen-tau Yih
Luke Zettlemoyer
106
21
0
25 Oct 2022
Help me write a poem: Instruction Tuning as a Vehicle for Collaborative Poetry Writing
Tuhin Chakrabarty
Vishakh Padmakumar
Hengxing He
86
82
0
25 Oct 2022
Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook
Baihan Lin
OffRL
AI4TS
129
27
0
24 Oct 2022
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
Maarten Sap
Ronan Le Bras
Daniel Fried
Yejin Choi
101
232
0
24 Oct 2022
NVIDIA FLARE: Federated Learning from Simulation to Real-World
H. Roth
Yan Cheng
Yuhong Wen
Isaac Yang
Ziyue Xu
...
Daguang Xu
Nic Ma
Prerna Dogra
Mona G. Flores
Andrew Feng
FedML
AI4CE
97
101
0
24 Oct 2022
Leveraging Large Language Models for Multiple Choice Question Answering
Joshua Robinson
Christopher Rytting
David Wingate
ELM
248
200
0
22 Oct 2022
A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models
Alessandro Stolfo
Zhijing Jin
Kumar Shridhar
Bernhard Schölkopf
Mrinmaya Sachan
ELM
OOD
LRM
145
66
0
21 Oct 2022
Graphically Structured Diffusion Models
Christian D. Weilbach
William Harvey
Frank Wood
DiffM
87
7
0
20 Oct 2022
Boosting Natural Language Generation from Instructions with Meta-Learning
Budhaditya Deb
Guoqing Zheng
Ahmed Hassan Awadallah
72
16
0
20 Oct 2022
Large Language Models Can Self-Improve
Jiaxin Huang
S. Gu
Le Hou
Yuexin Wu
Xuezhi Wang
Hongkun Yu
Jiawei Han
ReLM
AI4MH
LRM
226
618
0
20 Oct 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
314
3,178
0
20 Oct 2022
Transcending Scaling Laws with 0.1% Extra Compute
Yi Tay
Jason W. Wei
Hyung Won Chung
Vinh Q. Tran
David R. So
...
Donald Metzler
Slav Petrov
N. Houlsby
Quoc V. Le
Mostafa Dehghani
LRM
109
71
0
20 Oct 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
108
24
0
19 Oct 2022
Scaling Laws for Reward Model Overoptimization
Leo Gao
John Schulman
Jacob Hilton
ALM
131
569
0
19 Oct 2022
TabLLM: Few-shot Classification of Tabular Data with Large Language Models
S. Hegselmann
Alejandro Buendia
Hunter Lang
Monica Agrawal
Xiaoyi Jiang
David Sontag
LMTD
135
235
0
19 Oct 2022
Towards a neural architecture of language: Deep learning versus logistics of access in neural architectures for compositional processing
F. Velde
43
0
0
19 Oct 2022
Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
Luke Vilnis
Yury Zemlyanskiy
Patrick C. Murray
Alexandre Passos
Sumit Sanghai
105
10
0
18 Oct 2022
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
...
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM
ELM
LRM
ReLM
298
1,144
0
17 Oct 2022
NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly
Yi R. Fung
Tuhin Chakraborty
Hao Guo
Owen Rambow
Smaranda Muresan
Heng Ji
86
43
0
16 Oct 2022
The Debate Over Understanding in AI's Large Language Models
Melanie Mitchell
D. Krakauer
ELM
159
223
0
14 Oct 2022
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey
Sachin Kumar
Vidhisha Balachandran
Lucille Njoo
Antonios Anastasopoulos
Yulia Tsvetkov
ELM
189
91
0
14 Oct 2022
Pretrained Transformers Do not Always Improve Robustness
Swaroop Mishra
Bhavdeep Singh Sachdeva
Chitta Baral
VLM
58
2
0
14 Oct 2022
"John is 50 years old, can his son be 65?" Evaluating NLP Models' Understanding of Feasibility
Himanshu Gupta
Neeraj Varshney
Swaroop Mishra
Kuntal Kumar Pal
Saurabh Arjun Sawant
Kevin Scaria
Siddharth Goyal
Chitta Baral
ELM
104
14
0
14 Oct 2022
Large Language Models are few(1)-shot Table Reasoners
Wenhu Chen
LMTD
ReLM
LRM
93
153
0
13 Oct 2022
PoliGraph: Automated Privacy Policy Analysis using Knowledge Graphs (Journal Version)
Hao Cui
R. Trimananda
A. Markopoulou
Scott Jordan
89
18
0
13 Oct 2022
Previous
1
2
3
...
124
125
126
127
128
Next