Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,370 papers shown
Title
Language models show human-like content effects on reasoning tasks
Ishita Dasgupta
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Hannah R. Sheahan
Antonia Creswell
D. Kumaran
James L. McClelland
Felix Hill
ReLM, LRM
136
188
0
14 Jul 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models
Wenlong Huang
F. Xia
Ted Xiao
Harris Chan
Jacky Liang
...
Tomas Jackson
Linda Luu
Sergey Levine
Karol Hausman
Brian Ichter
LLMAG, LM&Ro, LRM
199
927
0
12 Jul 2022
What is Flagged in Uncertainty Quantification? Latent Density Models for Uncertainty Categorization
Hao Sun
B. V. Breugel
Jonathan Crabbé
Nabeel Seedat
M. Schaar
87
4
0
11 Jul 2022
Big Learning
Yulai Cong
Miaoyun Zhao
AI4CE
94
0
0
08 Jul 2022
BioTABQA: Instruction Learning for Biomedical Table Question Answering
Man Luo
S. Saxena
Swaroop Mishra
Mihir Parmar
Chitta Baral
LMTD
196
16
0
06 Jul 2022
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
Guosheng Lin
SyDa, ALM
227
273
0
05 Jul 2022
Rationale-Augmented Ensembles in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Denny Zhou
ReLM, LRM
121
126
0
02 Jul 2022
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing
Jason Alan Fries
Leon Weber
Natasha Seelam
Gabriel Altay
Debajyoti Datta
...
Minh Chien Vu
Trishala Neeraj
Jonas Golde
Albert Villanova del Moral
Benjamin Beilharz
LM&MA
151
49
0
30 Jun 2022
Aligning Artificial Intelligence with Humans through Public Policy
John J. Nay
James M. Daily
19
2
0
25 Jun 2022
PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change
Karthik Valmeekam
Matthew Marquez
Alberto Olmo
S. Sreedharan
Subbarao Kambhampati
ReLM, LRM
115
237
0
21 Jun 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM, ReLM, LRM
322
2,527
0
15 Jun 2022
Language Models are General-Purpose Interfaces
Y. Hao
Haoyu Song
Li Dong
Shaohan Huang
Zewen Chi
Wenhui Wang
Shuming Ma
Furu Wei
MLLM
78
102
0
13 Jun 2022
X-Risk Analysis for AI Research
Dan Hendrycks
Mantas Mazeika
77
71
0
13 Jun 2022
Offline RL for Natural Language Generation with Implicit Language Q Learning
Charles Burton Snell
Ilya Kostrikov
Yi Su
Mengjiao Yang
Sergey Levine
OffRL
221
115
0
05 Jun 2022
Acquiring and Modelling Abstract Commonsense Knowledge via Conceptualization
Mutian He
Tianqing Fang
Weiqi Wang
Yangqiu Song
94
30
0
03 Jun 2022
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
CLL
105
57
0
01 Jun 2022
Leveraging Pre-Trained Language Models to Streamline Natural Language Interaction for Self-Tracking
Young-Ho Kim
Sungdong Kim
Minsuk Chang
Sang-Woo Lee
96
5
0
31 May 2022
IGLU 2022: Interactive Grounded Language Understanding in a Collaborative Environment at NeurIPS 2022
Julia Kiseleva
Alexey Skrynnik
Artem Zholus
Shrestha Mohanty
Negar Arabzadeh
...
Aleksandr I. Panov
Yuxuan Sun
Kavya Srinet
Arthur Szlam
Ahmed Hassan Awadallah
LLMAG
59
21
0
27 May 2022
Can Foundation Models Help Us Achieve Perfect Secrecy?
Simran Arora
Christopher Ré
FedML
92
8
0
27 May 2022
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu
Sean Welleck
Jack Hessel
Liwei Jiang
Lianhui Qin
Peter West
Prithviraj Ammanabrolu
Yejin Choi
MU
176
220
0
26 May 2022
Large Language Models are Few-Shot Clinical Information Extractors
Monica Agrawal
S. Hegselmann
Hunter Lang
Yoon Kim
David Sontag
BDL, LM&MA
253
351
0
25 May 2022
Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations
Kang Min Yoo
Junyeob Kim
Sungmin Cho
Hyunsoo Cho
Hwiyeol Jo
Sang-Woo Lee
Sang-goo Lee
Taeuk Kim
116
129
0
25 May 2022
Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning
Jiahui Gao
Renjie Pi
Yong Lin
Hang Xu
Jiacheng Ye
Zhiyong Wu
Weizhong Zhang
Xiaodan Liang
Zhenguo Li
Lingpeng Kong
SyDa, VLM
158
50
0
25 May 2022
InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
Prakhar Gupta
Cathy Jiao
Yi-Ting Yeh
Shikib Mehri
M. Eskénazi
Jeffrey P. Bigham
ALM
119
48
0
25 May 2022
QAMPARI: An Open-domain Question Answering Benchmark for Questions with Many Answers from Multiple Paragraphs
S. Amouyal
Tomer Wolfson
Ohad Rubin
Ori Yoran
Jonathan Herzig
Jonathan Berant
RALM, VLM
88
27
0
25 May 2022
Few-shot Reranking for Multi-hop QA via Language Model Prompting
Muhammad Khalifa
Lajanugen Logeswaran
Moontae Lee
Ho Hin Lee
Lu Wang
LRM
114
20
0
25 May 2022
Multimodal Knowledge Alignment with Reinforcement Learning
Youngjae Yu
Jiwan Chung
Heeseung Yun
Jack Hessel
Jinho Park
...
Prithviraj Ammanabrolu
Rowan Zellers
Ronan Le Bras
Gunhee Kim
Yejin Choi
VLM
160
37
0
25 May 2022
Non-Programmers Can Label Programs Indirectly via Active Examples: A Case Study with Text-to-SQL
Ruiqi Zhong
Charles Burton Snell
Dan Klein
Jason Eisner
115
9
0
25 May 2022
ClaimDiff: Comparing and Contrasting Claims on Contentious Issues
Miyoung Ko
Ingyu Seong
Hwaran Lee
Joonsuk Park
Minsuk Chang
Minjoon Seo
84
3
0
24 May 2022
Looking for a Handsome Carpenter! Debiasing GPT-3 Job Advertisements
Conrad Borchers
Dalia Sara Gala
Ben Gilburt
Eduard Oravkin
Wilfried Bounsi
Yuki M. Asano
Hannah Rose Kirk
AI4CE
72
29
0
23 May 2022
Instruction Induction: From Few Examples to Natural Language Task Descriptions
Or Honovich
Uri Shaham
Samuel R. Bowman
Omer Levy
ELM, LRM
286
146
0
22 May 2022
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
Zhenhailong Wang
Manling Li
Ruochen Xu
Luowei Zhou
Jie Lei
...
Chenguang Zhu
Derek Hoiem
Shih-Fu Chang
Joey Tianyi Zhou
Heng Ji
MLLM, VLM
225
142
0
22 May 2022
Scaling Laws and Interpretability of Learning from Repeated Data
Danny Hernandez
Tom B. Brown
Tom Conerly
Nova Dassarma
Dawn Drain
...
Catherine Olsson
Dario Amodei
Nicholas Joseph
Jared Kaplan
Sam McCandlish
90
118
0
21 May 2022
RankGen: Improving Text Generation with Large Ranking Models
Kalpesh Krishna
Yapei Chang
John Wieting
Mohit Iyyer
AIMat
83
69
0
19 May 2022
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro, LLMAG, AI4CE
217
827
0
12 May 2022
UL2: Unifying Language Learning Paradigms
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
Xavier Garcia
Jason W. Wei
...
Tal Schuster
H. Zheng
Denny Zhou
N. Houlsby
Donald Metzler
AI4CE
141
313
0
10 May 2022
A Simple Contrastive Learning Objective for Alleviating Neural Text Degeneration
Shaojie Jiang
Ruqing Zhang
Svitlana Vakulenko
Maarten de Rijke
99
16
0
05 May 2022
Language Models in the Loop: Incorporating Prompting into Weak Supervision
Ryan Smith
Jason Alan Fries
Braden Hancock
Stephen H. Bach
112
56
0
04 May 2022
Improving In-Context Few-Shot Learning via Self-Supervised Training
Mingda Chen
Jingfei Du
Ramakanth Pasunuru
Todor Mihaylov
Srini Iyer
Ves Stoyanov
Zornitsa Kozareva
SSL, AI4MH
109
67
0
03 May 2022
Adversarial Training for High-Stakes Reliability
Daniel M. Ziegler
Seraphina Nix
Lawrence Chan
Tim Bauman
Peter Schmidt-Nielsen
...
Noa Nabeshima
Benjamin Weinstein-Raun
D. Haas
Buck Shlegeris
Nate Thomas
AAML
137
61
0
03 May 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM, OSLM, AI4CE
397
3,714
0
02 May 2022
Training Language Models with Language Feedback
Jérémy Scheurer
Jon Ander Campos
Jun Shern Chan
Angelica Chen
Kyunghyun Cho
Ethan Perez
ALM
124
51
0
29 Apr 2022
Standing on the Shoulders of Giant Frozen Language Models
Yoav Levine
Itay Dalmedigos
Ori Ram
Yoel Zeldes
Daniel Jannai
...
Barak Lenz
Shai Shalev-Shwartz
Amnon Shashua
Kevin Leyton-Brown
Y. Shoham
VLM
99
49
0
21 Apr 2022
Unsupervised Cross-Task Generalization via Retrieval Augmentation
Bill Yuchen Lin
Kangmin Tan
Chris Miller
Beiwen Tian
Xiang Ren
LRM, RALM
84
49
0
17 Apr 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
137
864
0
16 Apr 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
189
841
0
14 Apr 2022
InCoder: A Generative Model for Code Infilling and Synthesis
Daniel Fried
Armen Aghajanyan
Jessy Lin
Sida I. Wang
Eric Wallace
Freda Shi
Ruiqi Zhong
Wen-tau Yih
Luke Zettlemoyer
M. Lewis
SyDa
125
659
0
12 Apr 2022
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai
Andy Jones
Kamal Ndousse
Amanda Askell
Anna Chen
...
Jack Clark
Sam McCandlish
C. Olah
Benjamin Mann
Jared Kaplan
262
2,631
0
12 Apr 2022
Can language models learn from explanations in context?
Andrew Kyle Lampinen
Ishita Dasgupta
Stephanie C. Y. Chan
Kory Matthewson
Michael Henry Tessler
Antonia Creswell
James L. McClelland
Jane X. Wang
Felix Hill
LRM, ReLM
190
302
0
05 Apr 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM, LRM
590
6,322
0
05 Apr 2022