ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.02155
  4. Cited By
Training language models to follow instructions with human feedback

Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLMALM
ArXiv (abs)PDFHTML

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,390 papers shown
Title
InstructTODS: Large Language Models for End-to-End Task-Oriented
  Dialogue Systems
InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems
Willy Chung
Samuel Cahyawijaya
Bryan Wilie
Holy Lovenia
Pascale Fung
82
6
0
13 Oct 2023
End-to-end Story Plot Generator
End-to-end Story Plot Generator
Hanlin Zhu
Andrew Cohen
Danqing Wang
Kevin Kaichuang Yang
Xiaomeng Yang
Jiantao Jiao
Yuandong Tian
65
5
0
13 Oct 2023
GLoRE: Evaluating Logical Reasoning of Large Language Models
GLoRE: Evaluating Logical Reasoning of Large Language Models
Hanmeng Liu
Zhiyang Teng
Ruoxi Ning
Jian Liu
Qiji Zhou
Yuexin Zhang
Yue Zhang
ReLMELMLRM
167
8
0
13 Oct 2023
GameGPT: Multi-agent Collaborative Framework for Game Development
GameGPT: Multi-agent Collaborative Framework for Game Development
Dake Chen
Hanbin Wang
Yunhao Huo
Yuzhao Li
Haoyang Zhang
LLMAG
94
19
0
12 Oct 2023
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Wei Ping
Ming-Yu Liu
Lawrence C. McAfee
Peng Xu
Bo Li
Mohammad Shoeybi
Bryan Catanzaro
RALM
118
54
0
11 Oct 2023
The Past, Present and Better Future of Feedback Learning in Large
  Language Models for Subjective Human Preferences and Values
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values
Hannah Rose Kirk
Andrew M. Bean
Bertie Vidgen
Paul Röttger
Scott A. Hale
ALM
117
50
0
11 Oct 2023
How Do Large Language Models Capture the Ever-changing World Knowledge?
  A Review of Recent Advances
How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances
Zihan Zhang
Meng Fang
Lingxi Chen
Mohammad-Reza Namazi-Rad
Jun Wang
KELM
98
24
0
11 Oct 2023
Beyond Factuality: A Comprehensive Evaluation of Large Language Models
  as Knowledge Generators
Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators
Liang Chen
Yang Deng
Yatao Bian
Zeyu Qin
Bingzhe Wu
Tat-Seng Chua
Kam-Fai Wong
HILMELM
121
47
0
11 Oct 2023
Lemur: Harmonizing Natural Language and Code for Language Agents
Lemur: Harmonizing Natural Language and Code for Language Agents
Yiheng Xu
Hongjin Su
Chen Xing
Boyu Mi
Qian Liu
...
Siheng Zhao
Lingpeng Kong
Bailin Wang
Caiming Xiong
Tao Yu
99
74
0
10 Oct 2023
The Limits of ChatGPT in Extracting Aspect-Category-Opinion-Sentiment
  Quadruples: A Comparative Analysis
The Limits of ChatGPT in Extracting Aspect-Category-Opinion-Sentiment Quadruples: A Comparative Analysis
Xiancai Xu
Jia-Dong Zhang
Rongchang Xiao
Lei Xiong
42
6
0
10 Oct 2023
Constructive Large Language Models Alignment with Diverse Feedback
Constructive Large Language Models Alignment with Diverse Feedback
Tianshu Yu
Ting-En Lin
Yuchuan Wu
Min Yang
Fei Huang
Yongbin Li
ALM
104
9
0
10 Oct 2023
CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding
CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding
Eslam Mohamed Bakr
Mohamed Ayman
Mahmoud Ahmed
Habib Slim
Mohamed Elhoseiny
LRM
95
12
0
10 Oct 2023
Diversity from Human Feedback
Diversity from Human Feedback
Ren-Jian Wang
Ke Xue
Yutong Wang
Peng Yang
Haobo Fu
Qiang Fu
Chao Qian
86
3
0
10 Oct 2023
Reinforcement Learning in the Era of LLMs: What is Essential? What is
  needed? An RL Perspective on RLHF, Prompting, and Beyond
Reinforcement Learning in the Era of LLMs: What is Essential? What is needed? An RL Perspective on RLHF, Prompting, and Beyond
Hao Sun
OffRL
87
23
0
09 Oct 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
120
5
0
09 Oct 2023
FireAct: Toward Language Agent Fine-tuning
FireAct: Toward Language Agent Fine-tuning
Baian Chen
Chang Shu
Ehsan Shareghi
Nigel Collier
Karthik Narasimhan
Shunyu Yao
ALMLLMAG
185
112
0
09 Oct 2023
Fine-grained Audio-Visual Joint Representations for Multimodal Large
  Language Models
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models
Guangzhi Sun
Wenyi Yu
Changli Tang
Xianzhao Chen
Tian Tan
Wei Li
Lu Lu
Zejun Ma
Chao Zhang
105
13
0
09 Oct 2023
Improving Summarization with Human Edits
Improving Summarization with Human Edits
Zonghai Yao
Benjamin J Schloss
Sai P. Selvaraj
113
4
0
09 Oct 2023
STREAM: Social data and knowledge collective intelligence platform for
  TRaining Ethical AI Models
STREAM: Social data and knowledge collective intelligence platform for TRaining Ethical AI Models
Yuwei Wang
Enmeng Lu
Zizhe Ruan
Yao Liang
Yi Zeng
AI4TS
84
4
0
09 Oct 2023
Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained
  Large Language Models with Template-Content Structure
Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure
Haotong Yang
Fanxu Meng
Zhouchen Lin
Muhan Zhang
LRM
109
3
0
09 Oct 2023
CCAE: A Corpus of Chinese-based Asian Englishes
CCAE: A Corpus of Chinese-based Asian Englishes
Yang Liu
Melissa Xiaohui Qin
Long Wang
Chao-Wei Huang
78
0
0
09 Oct 2023
Retrieval-Generation Synergy Augmented Large Language Models
Retrieval-Generation Synergy Augmented Large Language Models
Zhangyin Feng
Xiaocheng Feng
Dezhi Zhao
Maojin Yang
Bing Qin
LRMRALM
51
31
0
08 Oct 2023
"A Nova Eletricidade: Aplicações, Riscos e Tendências da IA
  Moderna -- "The New Electricity": Applications, Risks, and Trends in Current
  AI
"A Nova Eletricidade: Aplicações, Riscos e Tendências da IA Moderna -- "The New Electricity": Applications, Risks, and Trends in Current AI
A. Bazzan
Anderson R. Tavares
André G. Pereira
C. R. Jung
Jacob Scharcanski
J. Carbonera
Luís C. Lamb
Mariana Recamonde Mendoza
T. L. T. D. Silveira
V. P. Moreira
56
0
0
08 Oct 2023
Crystal: Introspective Reasoners Reinforced with Self-Feedback
Crystal: Introspective Reasoners Reinforced with Self-Feedback
Jiacheng Liu
Ramakanth Pasunuru
Hannaneh Hajishirzi
Yejin Choi
Asli Celikyilmaz
LRMReLM
79
24
0
07 Oct 2023
ForeSeer: Product Aspect Forecasting Using Temporal Graph Embedding
ForeSeer: Product Aspect Forecasting Using Temporal Graph Embedding
Zixuan Liu
Gaurush Hiranandani
Kun Qian
E-Wen Huang
Yi Xu
Belinda Zeng
Karthik Subbian
Sheng Wang
AI4TS
86
1
0
07 Oct 2023
Parameter Efficient Multi-task Model Fusion with Partial Linearization
Parameter Efficient Multi-task Model Fusion with Partial Linearization
Anke Tang
Li Shen
Yong Luo
Yibing Zhan
Han Hu
Bo Du
Yixin Chen
Dacheng Tao
MoMe
124
36
0
07 Oct 2023
Automatic and Human-AI Interactive Text Generation
Automatic and Human-AI Interactive Text Generation
Yao Dou
Philippe Laban
Claire Gardent
Wei Xu
82
4
0
05 Oct 2023
A Long Way to Go: Investigating Length Correlations in RLHF
A Long Way to Go: Investigating Length Correlations in RLHF
Prasann Singhal
Tanya Goyal
Jiacheng Xu
Greg Durrett
160
161
0
05 Oct 2023
SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks
SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks
Alexander Robey
Eric Wong
Hamed Hassani
George J. Pappas
AAML
207
260
0
05 Oct 2023
Concise and Organized Perception Facilitates Reasoning in Large Language Models
Concise and Organized Perception Facilitates Reasoning in Large Language Models
Junjie Liu
Shaotian Yan
Chen Shen
Zhengdong Xiao
Wenxiao Wang
Jieping Ye
Jieping Ye
LRM
125
1
0
05 Oct 2023
LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models
LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models
Saaket Agashe
Yue Fan
Anthony Reyna
Xin Eric Wang
LLMAGLRM
164
16
0
05 Oct 2023
JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning
JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning
Chang Gao
Wenxuan Zhang
Guizhen Chen
Wai Lam
224
6
0
04 Oct 2023
A Survey of GPT-3 Family Large Language Models Including ChatGPT and
  GPT-4
A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4
Katikapalli Subramanyam Kalyan
LM&MAAI4CELRMAILawELM
131
248
0
04 Oct 2023
Large language models in textual analysis for gesture selection
Large language models in textual analysis for gesture selection
Laura Birka Hensel
Nutchanon Yongsatianchot
P. Torshizi
E. Minucci
Stacy Marsella
SLR
87
8
0
04 Oct 2023
Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation
Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation
Chen Dun
Mirian Hipolito Garcia
Guoqing Zheng
Ahmed Hassan Awadallah
Anastasios Kyrillidis
Robert Sim
212
6
0
04 Oct 2023
Reward Model Ensembles Help Mitigate Overoptimization
Reward Model Ensembles Help Mitigate Overoptimization
Thomas Coste
Usman Anwar
Robert Kirk
David M. Krueger
NoLaALM
118
139
0
04 Oct 2023
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
Hao Sha
Yao Mu
Yuxuan Jiang
Li Chen
Chenfeng Xu
Ping Luo
Shengbo Eben Li
Masayoshi Tomizuka
Wei Zhan
Mingyu Ding
265
179
0
04 Oct 2023
Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation
Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation
Benjamin Steenhoek
Michele Tufano
Neel Sundaresan
Alexey Svyatkovskiy
OffRLALM
153
22
0
03 Oct 2023
Automatic Pair Construction for Contrastive Post-training
Automatic Pair Construction for Contrastive Post-training
Canwen Xu
Corby Rosset
Ethan C. Chau
Luciano Del Corro
Shweti Mahajan
Julian McAuley
Jennifer Neville
Ahmed Hassan Awadallah
Nikhil Rao
ALM
67
4
0
03 Oct 2023
Dimensions of Disagreement: Unpacking Divergence and Misalignment in
  Cognitive Science and Artificial Intelligence
Dimensions of Disagreement: Unpacking Divergence and Misalignment in Cognitive Science and Artificial Intelligence
Kerem Oktar
Ilia Sucholutsky
Tania Lombrozo
Thomas Griffiths
AI4CE
279
3
0
03 Oct 2023
Investigating Large Language Models' Perception of Emotion Using
  Appraisal Theory
Investigating Large Language Models' Perception of Emotion Using Appraisal Theory
Nutchanon Yongsatianchot
P. Torshizi
Stacy Marsella
LLMAG
81
12
0
03 Oct 2023
TWIZ-v2: The Wizard of Multimodal Conversational-Stimulus
TWIZ-v2: The Wizard of Multimodal Conversational-Stimulus
Rafael Ferreira
Diogo Tavares
Diogo Glória-Silva
Rodrigo Valerio
João Bordalo
Ines Simoes
Vasco Ramos
David Semedo
João Magalhães
45
4
0
03 Oct 2023
Can large language models provide useful feedback on research papers? A
  large-scale empirical analysis
Can large language models provide useful feedback on research papers? A large-scale empirical analysis
Weixin Liang
Yuhui Zhang
Hancheng Cao
Binglu Wang
Daisy Ding
...
Siyu He
D. Smith
Yian Yin
Daniel A. McFarland
James Y. Zou
ALMLM&MA
105
146
0
03 Oct 2023
HPC-GPT: Integrating Large Language Model for High-Performance Computing
HPC-GPT: Integrating Large Language Model for High-Performance Computing
Xianzhong Ding
Le Chen
M. Emani
Chunhua Liao
Pei-Hung Lin
T. Vanderbruggen
Zhen Xie
Alberto Cerpa
Wan Du
LM&MAAI4MH
83
29
0
03 Oct 2023
LoFT: Local Proxy Fine-tuning For Improving Transferability Of
  Adversarial Attacks Against Large Language Model
LoFT: Local Proxy Fine-tuning For Improving Transferability Of Adversarial Attacks Against Large Language Model
Muhammad Ahmed Shah
Roshan S. Sharma
Hira Dhamyal
R. Olivier
Ankit Shah
...
Massa Baali
Soham Deshmukh
Michael Kuhlmann
Bhiksha Raj
Rita Singh
AAML
67
21
0
02 Oct 2023
Tool-Augmented Reward Modeling
Tool-Augmented Reward Modeling
Lei Li
Yekun Chai
Shuohuan Wang
Yu Sun
Hao Tian
Ningyu Zhang
Hua Wu
OffRL
115
14
0
02 Oct 2023
Language Model Decoding as Direct Metrics Optimization
Language Model Decoding as Direct Metrics Optimization
Haozhe Ji
Pei Ke
Hongning Wang
Minlie Huang
65
7
0
02 Oct 2023
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits
Qiwei Di
Tao Jin
Yue Wu
Heyang Zhao
Farzad Farnoud
Quanquan Gu
100
14
0
02 Oct 2023
Enabling Language Models to Implicitly Learn Self-Improvement
Enabling Language Models to Implicitly Learn Self-Improvement
Ziqi Wang
Le Hou
Tianjian Lu
Yuexin Wu
Yunxuan Li
Hongkun Yu
Heng Ji
ReLMLRM
67
6
0
02 Oct 2023
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech
Dareen Alharthi
Roshan S. Sharma
Hira Dhamyal
Soumi Maiti
Bhiksha Raj
Rita Singh
77
4
0
01 Oct 2023
Previous
123...112113114...126127128
Next