ResearchTrend.AI
arXiv: 2203.02155
Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,380 papers shown
Title
Scenarios and Approaches for Situated Natural Language Explanations
Pengshuo Qiu
Frank Rudzicz
Zining Zhu
LRM
95
0
0
07 Jun 2024
CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search
Fengran Mo
Abbas Ghaddar
Kelong Mao
Mehdi Rezagholizadeh
Boxing Chen
Qun Liu
Jian-Yun Nie
107
16
0
07 Jun 2024
Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models
M. Romaszewski
Przemysław Sekuła
P. Głomb
M. Cholewa
Katarzyna Kołodziej
68
0
0
07 Jun 2024
RU-AI: A Large Multimodal Dataset for Machine Generated Content Detection
Liting Huang
Zhihao Zhang
Yiran Zhang
Xiyue Zhou
Shoujin Wang
NoLa
81
4
0
07 Jun 2024
Do Language Models Exhibit Human-like Structural Priming Effects?
Jaap Jumelet
Willem H. Zuidema
Arabella J. Sinclair
93
11
0
07 Jun 2024
FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models
Guangyi Liu
Rui Ge
Xinyu Zhu
Jingyi Chai
Yaxin Du
Yang Liu
Yanfeng Wang
Siheng Chen
FedML
108
19
0
07 Jun 2024
MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description
Cong Yang
Zuchao Li
Lefei Zhang
74
2
0
07 Jun 2024
Mixture-of-Agents Enhances Large Language Model Capabilities
Junlin Wang
Jue Wang
Ben Athiwaratkun
Ce Zhang
James Zou
LLMAG, AIFin
104
138
0
07 Jun 2024
Large Language Model-guided Document Selection
Xiang Kong
Tom Gunter
Ruoming Pang
70
4
0
07 Jun 2024
Extroversion or Introversion? Controlling The Personality of Your Large Language Models
Yanquan Chen
Zhen Wu
Junjie Guo
Shujian Huang
Xinyu Dai
46
0
0
07 Jun 2024
Learning Task Decomposition to Assist Humans in Competitive Programming
Jiaxin Wen
Ruiqi Zhong
Pei Ke
Zhihong Shao
Hongning Wang
Minlie Huang
ReLM
126
9
0
07 Jun 2024
Proofread: Fixes All Errors with One Tap
Renjie Liu
Yanxiang Zhang
Yun Zhu
Haicheng Sun
Yuanbo Zhang
Michael Xuelin Huang
Shanqing Cai
Lei Meng
Shumin Zhai
ALM
72
3
0
06 Jun 2024
FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models
Max Zhu
A. Bazaga
Pietro Liò
AI4CE
103
3
0
06 Jun 2024
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs
Lingchen Meng
Jianwei Yang
Rui Tian
Xiyang Dai
Zuxuan Wu
Jianfeng Gao
Yu-Gang Jiang
VLM
90
9
0
06 Jun 2024
PaCE: Parsimonious Concept Engineering for Large Language Models
Jinqi Luo
Tianjiao Ding
Kwan Ho Ryan Chan
D. Thaker
Aditya Chattopadhyay
Chris Callison-Burch
René Vidal
CVBM
98
12
0
06 Jun 2024
Improving Alignment and Robustness with Circuit Breakers
Andy Zou
Long Phan
Justin Wang
Derek Duenas
Maxwell Lin
Maksym Andriushchenko
Rowan Wang
Zico Kolter
Matt Fredrikson
Dan Hendrycks
AAML
147
114
0
06 Jun 2024
VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval
Yueze Wang
Zheng Liu
Shitao Xiao
Bo Zhao
Yongping Xiong
114
29
0
06 Jun 2024
Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People
Dun-Ming Huang
Pol van Rijn
Ilia Sucholutsky
Raja Marjieh
Nori Jacoby
64
2
0
06 Jun 2024
Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models
Xiang Ji
Sanjeev Kulkarni
Mengdi Wang
Tengyang Xie
OffRL
110
5
0
06 Jun 2024
Open-Endedness is Essential for Artificial Superhuman Intelligence
Edward Hughes
Michael Dennis
Jack Parker-Holder
Feryal M. P. Behbahani
Aditi Mavalankar
Yuge Shi
Tom Schaul
Tim Rocktäschel
LRM
106
33
0
06 Jun 2024
Benchmark Data Contamination of Large Language Models: A Survey
Cheng Xu
Shuhao Guan
Derek Greene
Mohand-Tahar Kechadi
ELM, ALM
96
56
0
06 Jun 2024
mCSQA: Multilingual Commonsense Reasoning Dataset with Unified Creation Strategy by Language Models and Humans
Yusuke Sakai
Hidetaka Kamigaito
Taro Watanabe
LRM
96
5
0
06 Jun 2024
Aligning Agents like Large Language Models
Adam Jelley
Yuhan Cao
Dave Bignell
Sam Devlin
Tabish Rashid
LM&Ro
117
1
0
06 Jun 2024
Confabulation: The Surprising Value of Large Language Model Hallucinations
Peiqi Sui
Eamon Duede
Sophie Wu
Richard Jean So
HILM, LLMAG
88
23
0
06 Jun 2024
Prototypical Reward Network for Data-Efficient RLHF
Jinghan Zhang
Xiting Wang
Yiqiao Jin
Changyu Chen
Xinhao Zhang
Kunpeng Liu
ALM
91
22
0
06 Jun 2024
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Zhiheng Xi
Yiwen Ding
Wenxiang Chen
Boyang Hong
Honglin Guo
...
Qi Zhang
Xipeng Qiu
Xuanjing Huang
Zuxuan Wu
Yu-Gang Jiang
LLMAG, LM&Ro
119
42
0
06 Jun 2024
Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness
Guangliang Liu
Milad Afshari
Xitong Zhang
Zhiyu Xue
Avrajit Ghosh
Bidhan Bashyal
Rongrong Wang
K. Johnson
78
0
0
06 Jun 2024
Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models
Jisu Shin
Hoyun Song
Huije Lee
Soyeong Jeong
Jong C. Park
112
9
0
06 Jun 2024
XL-HeadTags: Leveraging Multimodal Retrieval Augmentation for the Multilingual Generation of News Headlines and Tags
Faisal Tareque Shohan
Mir Tafseer Nayeem
Samsul Islam
Abu Ubaida Akash
Shafiq Joty
76
4
0
06 Jun 2024
Effective Context Selection in LLM-based Leaderboard Generation: An Empirical Study
Salomon Kabongo
Jennifer D'Souza
Sören Auer
75
5
0
06 Jun 2024
Exploring the Latest LLMs for Leaderboard Extraction
Salomon Kabongo
Jennifer D'Souza
Sören Auer
56
2
0
06 Jun 2024
Efficient Knowledge Infusion via KG-LLM Alignment
Zhouyu Jiang
Ling Zhong
Mengshu Sun
Jun Xu
Rui Sun
Hui Cai
Shuhan Luo
Qing Cui
65
10
0
06 Jun 2024
LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification
Chun Liu
Hongguang Zhang
Kainan Zhao
Xinghai Ju
Lin Yang
82
4
0
06 Jun 2024
JIGMARK: A Black-Box Approach for Enhancing Image Watermarks against Diffusion Model Edits
Minzhou Pan
Yi Zeng
Xue Lin
Ning Yu
Cho-Jui Hsieh
Peter Henderson
Ruoxi Jia
WIGM
131
4
0
06 Jun 2024
Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning
Xiaohu Du
Ming Wen
Jiahao Zhu
Zifan Xie
Shezheng Song
Bin Ji
Xuanhua Shi
Hai Jin
91
17
0
06 Jun 2024
A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions
Lei Liu
Xiaoyan Yang
Junchi Lei
Xiaoyang Liu
Yue Shen
...
Peng Wei
Jinjie Gu
Zhixuan Chu
Zhan Qin
Kui Ren
LM&MA, AILaw
100
18
0
06 Jun 2024
Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking
Roland Stolz
Hanna Krasowski
Jakob Thumm
Michael Eichelbeck
Philipp Gassert
Matthias Althoff
CLL
46
4
0
06 Jun 2024
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering
Anand Subramanian
Viktor Schlegel
Abhinav Ramesh Kashyap
Thanh-Tung Nguyen
Vijay Prakash Dwivedi
Stefan Winkler
ELM, LM&MA, AI4MH
66
3
0
06 Jun 2024
A Survey of Language-Based Communication in Robotics
William Hunt
Sarvapali D. Ramchurn
Mohammad D. Soorati
LM&Ro
256
13
0
06 Jun 2024
Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art
Chen Cecilia Liu
Iryna Gurevych
Anna Korhonen
188
6
0
06 Jun 2024
Ranking Manipulation for Conversational Search Engines
Samuel Pfrommer
Yatong Bai
Tanmay Gautam
Somayeh Sojoudi
SILM
102
5
0
05 Jun 2024
Wings: Learning Multimodal LLMs without Text-only Forgetting
Yi-Kai Zhang
Shiyin Lu
Yang Li
Yanqing Ma
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
De-Chuan Zhan
Han-Jia Ye
VLM
128
10
0
05 Jun 2024
LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback
Timon Ziegenbein
Gabriella Skitalinskaya
Alireza Bayat Makou
Henning Wachsmuth
LLMAG, KELM
101
8
0
05 Jun 2024
Llumnix: Dynamic Scheduling for Large Language Model Serving
Biao Sun
Ziming Huang
Hanyu Zhao
Wencong Xiao
Xinyi Zhang
Yong Li
Wei Lin
93
57
0
05 Jun 2024
Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models
Flavio Petruzzellis
Alberto Testolin
A. Sperduti
ReLM, LRM
90
3
0
05 Jun 2024
Which Side Are You On? A Multi-task Dataset for End-to-End Argument Summarisation and Evaluation
Hao Li
Yuping Wu
Viktor Schlegel
Riza Batista-Navarro
Tharindu Madusanka
...
Jiayan Zeng
Xiaochi Wang
Xinran He
Yizhi Li
Goran Nenadic
109
8
0
05 Jun 2024
DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays
Bo Xia
Yilun Kong
Yongzhe Chang
Bo Yuan
Zhiheng Li
Xueqian Wang
Bin Liang
OffRL
106
3
0
05 Jun 2024
FragRel: Exploiting Fragment-level Relations in the External Memory of Large Language Models
Xihang Yue
Linchao Zhu
Yi Yang
KELM
85
0
0
05 Jun 2024
Exploring Human-AI Perception Alignment in Sensory Experiences: Do LLMs Understand Textile Hand?
Shu Zhong
Elia Gatti
Youngjun Cho
Marianna Obrist
81
3
0
05 Jun 2024
Bi-Chainer: Automated Large Language Models Reasoning with Bidirectional Chaining
Shuqi Liu
Bowei He
Linqi Song
LRM
79
1
0
05 Jun 2024