ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.02155
  4. Cited By
Training language models to follow instructions with human feedback

Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLMALM
ArXiv (abs)PDFHTML

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,390 papers shown
Title
Reformatted Alignment
Reformatted Alignment
Run-Ze Fan
Xuefeng Li
Haoyang Zou
Junlong Li
Shwai He
Ethan Chern
Jiewen Hu
Pengfei Liu
102
7
0
19 Feb 2024
Polarization of Autonomous Generative AI Agents Under Echo Chambers
Polarization of Autonomous Generative AI Agents Under Echo Chambers
Masaya Ohagi
LLMAG
65
8
0
19 Feb 2024
Enhancing Multilingual Capabilities of Large Language Models through
  Self-Distillation from Resource-Rich Languages
Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages
Yuan Zhang
Yile Wang
Zijun Liu
Shuo Wang
Xiaolong Wang
Peng Li
Maosong Sun
Yang Liu
LRM
122
15
0
19 Feb 2024
Amplifying Training Data Exposure through Fine-Tuning with
  Pseudo-Labeled Memberships
Amplifying Training Data Exposure through Fine-Tuning with Pseudo-Labeled Memberships
Myung Gyo Oh
Hong Eun Ahn
L. Park
T.-H. Kwon
MIALMAAML
87
0
0
19 Feb 2024
BIDER: Bridging Knowledge Inconsistency for Efficient
  Retrieval-Augmented LLMs via Key Supporting Evidence
BIDER: Bridging Knowledge Inconsistency for Efficient Retrieval-Augmented LLMs via Key Supporting Evidence
Jiajie Jin
Yutao Zhu
Yujia Zhou
Zhicheng Dou
RALM
104
23
0
19 Feb 2024
Transformer-based Causal Language Models Perform Clustering
Transformer-based Causal Language Models Perform Clustering
Xinbo Wu
Lav Varshney
78
6
0
19 Feb 2024
Your Large Language Model is Secretly a Fairness Proponent and You
  Should Prompt it Like One
Your Large Language Model is Secretly a Fairness Proponent and You Should Prompt it Like One
Tianlin Li
Xiaoyu Zhang
Chao Du
Tianyu Pang
Qian Liu
Qing Guo
Chao Shen
Yang Liu
ALM
76
11
0
19 Feb 2024
Enabling Weak LLMs to Judge Response Reliability via Meta Ranking
Enabling Weak LLMs to Judge Response Reliability via Meta Ranking
Zijun Liu
Boqun Kou
Peng Li
Ming Yan
Ji Zhang
Fei Huang
Yang Liu
108
3
0
19 Feb 2024
Do Large Language Models Understand Logic or Just Mimick Context?
Do Large Language Models Understand Logic or Just Mimick Context?
Junbing Yan
Chengyu Wang
Junyuan Huang
Wei Zhang
ReLMELMLRM
70
7
0
19 Feb 2024
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language
  Models Gains More
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More
Yuxuan Yue
Zhihang Yuan
Haojie Duanmu
Sifan Zhou
Jianlong Wu
Liqiang Nie
MQ
93
49
0
19 Feb 2024
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When
  and What to Retrieve for LLMs
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs
Jiejun Tan
Zhicheng Dou
Yutao Zhu
Peidong Guo
Kun Fang
Ji-Rong Wen
131
30
0
19 Feb 2024
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large
  Language Models
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models
Didi Zhu
Zhongyi Sun
Zexi Li
Tao Shen
Ke Yan
Shouhong Ding
Kun Kuang
Chao Wu
CLLKELMMoMe
131
31
0
19 Feb 2024
CoCo-Agent: A Comprehensive Cognitive MLLM Agent for Smartphone GUI
  Automation
CoCo-Agent: A Comprehensive Cognitive MLLM Agent for Smartphone GUI Automation
Xinbei Ma
Zhuosheng Zhang
Hai Zhao
LLMAG
107
34
0
19 Feb 2024
Direct Large Language Model Alignment Through Self-Rewarding Contrastive
  Prompt Distillation
Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation
Aiwei Liu
Haoping Bai
Zhiyun Lu
Xiang Kong
Simon Wang
Jiulong Shan
Mengsi Cao
Lijie Wen
ALM
74
13
0
19 Feb 2024
Learning to Edit: Aligning LLMs with Knowledge Editing
Learning to Edit: Aligning LLMs with Knowledge Editing
Yuxin Jiang
Yufei Wang
Chuhan Wu
Wanjun Zhong
Xingshan Zeng
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Qun Liu
Wei Wang
KELM
96
30
0
19 Feb 2024
SIBO: A Simple Booster for Parameter-Efficient Fine-Tuning
SIBO: A Simple Booster for Parameter-Efficient Fine-Tuning
Zhihao Wen
Jie Zhang
Yuan Fang
MoE
82
3
0
19 Feb 2024
ROSE Doesn't Do That: Boosting the Safety of Instruction-Tuned Large
  Language Models with Reverse Prompt Contrastive Decoding
ROSE Doesn't Do That: Boosting the Safety of Instruction-Tuned Large Language Models with Reverse Prompt Contrastive Decoding
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
LM&MA
100
23
0
19 Feb 2024
The Colorful Future of LLMs: Evaluating and Improving LLMs as Emotional
  Supporters for Queer Youth
The Colorful Future of LLMs: Evaluating and Improving LLMs as Emotional Supporters for Queer Youth
Shir Lissak
Nitay Calderon
Geva Shenkman
Yaakov Ophir
Eyal Fruchter
A. Klomek
Roi Reichart
AI4MH
121
16
0
19 Feb 2024
An Empirical Evaluation of LLMs for Solving Offensive Security
  Challenges
An Empirical Evaluation of LLMs for Solving Offensive Security Challenges
Minghao Shao
Boyuan Chen
Sofija Jancheska
Brendan Dolan-Gavitt
Siddharth Garg
Ramesh Karri
Mohamed Bennai
104
30
0
19 Feb 2024
FIPO: Free-form Instruction-oriented Prompt Optimization with Preference
  Dataset and Modular Fine-tuning Schema
FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema
Junru Lu
Siyu An
Min Zhang
Yulan He
Di Yin
Xing Sun
129
2
0
19 Feb 2024
LLM as Prompter: Low-resource Inductive Reasoning on Arbitrary Knowledge
  Graphs
LLM as Prompter: Low-resource Inductive Reasoning on Arbitrary Knowledge Graphs
Kai Wang
Yuwei Xu
Zhiyong Wu
Siqiang Luo
LRM
118
10
0
19 Feb 2024
Unveiling the Magic: Investigating Attention Distillation in
  Retrieval-augmented Generation
Unveiling the Magic: Investigating Attention Distillation in Retrieval-augmented Generation
Zizhong Li
Haopeng Zhang
Jiawei Zhang
RALM
94
1
0
19 Feb 2024
SInViG: A Self-Evolving Interactive Visual Agent for Human-Robot
  Interaction
SInViG: A Self-Evolving Interactive Visual Agent for Human-Robot Interaction
Jie Xu
Hanbo Zhang
Xinghang Li
Huaping Liu
Xuguang Lan
Tao Kong
LM&Ro
95
3
0
19 Feb 2024
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
Fengqing Jiang
Zhangchen Xu
Luyao Niu
Zhen Xiang
Bhaskar Ramasubramanian
Bo Li
Radha Poovendran
152
109
0
19 Feb 2024
Machine-Generated Text Localization
Machine-Generated Text Localization
Zhongping Zhang
Wenda Qin
Bryan A. Plummer
DeLMO
67
6
0
19 Feb 2024
Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversation
Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversation
Chanwoong Yoon
Gangwoo Kim
Byeongguk Jeon
Sungdong Kim
Yohan Jo
Jaewoo Kang
KELMRALM
137
0
0
19 Feb 2024
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles
Oleksandr Balabanov
Hampus Linander
UQCV
119
20
0
19 Feb 2024
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning
Renqiu Xia
Bo Zhang
Hancheng Ye
Xiangchao Yan
Qi Liu
...
Min Dou
Botian Shi
Junchi Yan
Junchi Yan
Yu Qiao
LRM
187
68
0
19 Feb 2024
An Empirical Categorization of Prompting Techniques for Large Language
  Models: A Practitioner's Guide
An Empirical Categorization of Prompting Techniques for Large Language Models: A Practitioner's Guide
Oluwole Fagbohun
Rachel M. Harrison
Anton Dereventsov
131
9
0
18 Feb 2024
How Susceptible are Large Language Models to Ideological Manipulation?
How Susceptible are Large Language Models to Ideological Manipulation?
Kai Chen
Zihao He
Jun Yan
Taiwei Shi
Kristina Lerman
135
14
0
18 Feb 2024
On the Roles of LLMs in Planning: Embedding LLMs into Planning Graphs
On the Roles of LLMs in Planning: Embedding LLMs into Planning Graphs
H. Zhuo
Xin Chen
Rong Pan
59
2
0
18 Feb 2024
Logical Closed Loop: Uncovering Object Hallucinations in Large
  Vision-Language Models
Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models
Jun Wu
Qiang Liu
Ding Wang
Jinghao Zhang
Shu Wu
Liang Wang
Tien-Ping Tan
LRM
95
24
0
18 Feb 2024
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models
S. Hayati
Taehee Jung
Tristan Bodding-Long
Sudipta Kar
A. Sethy
Joo-Kyung Kim
Dongyeop Kang
ALMLRM
140
9
0
18 Feb 2024
Advancing Translation Preference Modeling with RLHF: A Step Towards
  Cost-Effective Solution
Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution
Nuo Xu
Jun Zhao
Can Zu
Sixian Li
Lu Chen
...
Shihan Dou
Wenjuan Qin
Tao Gui
Qi Zhang
Xuanjing Huang
88
7
0
18 Feb 2024
Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM
Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM
Zijin Hong
Zheng Yuan
Hao Chen
Qinggang Zhang
Feiran Huang
Xiao Huang
124
32
0
18 Feb 2024
When Do LLMs Need Retrieval Augmentation? Mitigating LLMs'
  Overconfidence Helps Retrieval Augmentation
When Do LLMs Need Retrieval Augmentation? Mitigating LLMs' Overconfidence Helps Retrieval Augmentation
Shiyu Ni
Keping Bi
Jiafeng Guo
Xueqi Cheng
RALM
73
37
0
18 Feb 2024
Learning to Learn Faster from Human Feedback with Language Model
  Predictive Control
Learning to Learn Faster from Human Feedback with Language Model Predictive Control
Jacky Liang
Fei Xia
Wenhao Yu
Andy Zeng
Montse Gonzalez Arenas
...
N. Heess
Kanishka Rao
Nik Stewart
Jie Tan
Carolina Parada
LM&Ro
127
35
0
18 Feb 2024
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and
  Improving LLMs
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
Siyuan Wang
Zhongyu Wei
Yejin Choi
Xiang Ren
ReLMELMLRM
45
24
0
18 Feb 2024
EventRL: Enhancing Event Extraction with Outcome Supervision for Large
  Language Models
EventRL: Enhancing Event Extraction with Outcome Supervision for Large Language Models
Jun Gao
Huan Zhao
Wei Wang
Changlong Yu
Ruifeng Xu
OffRL
63
5
0
18 Feb 2024
Don't Go To Extremes: Revealing the Excessive Sensitivity and
  Calibration Limitations of LLMs in Implicit Hate Speech Detection
Don't Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech Detection
Min Zhang
Jianfeng He
Taoran Ji
Chang-Tien Lu
64
15
0
18 Feb 2024
FViT: A Focal Vision Transformer with Gabor Filter
FViT: A Focal Vision Transformer with Gabor Filter
Yulong Shi
Mingwei Sun
Yongshuai Wang
Rui Wang
155
4
0
17 Feb 2024
Dissecting Human and LLM Preferences
Dissecting Human and LLM Preferences
Junlong Li
Fan Zhou
Shichao Sun
Yikai Zhang
Hai Zhao
Pengfei Liu
ALM
89
6
0
17 Feb 2024
MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning
MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning
Shu Yang
Muhammad Asif Ali
Cheng-Long Wang
Lijie Hu
Di Wang
CLLMoE
114
46
0
17 Feb 2024
Aligning Large Language Models by On-Policy Self-Judgment
Aligning Large Language Models by On-Policy Self-Judgment
Sangkyu Lee
Sungdong Kim
Ashkan Yousefpour
Minjoon Seo
Kang Min Yoo
Youngjae Yu
OSLM
75
12
0
17 Feb 2024
CoLLaVO: Crayon Large Language and Vision mOdel
CoLLaVO: Crayon Large Language and Vision mOdel
Byung-Kwan Lee
Beomchan Park
Chae Won Kim
Yonghyun Ro
VLMMLLM
117
18
0
17 Feb 2024
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based
  Agents
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
Wenkai Yang
Xiaohan Bi
Yankai Lin
Sishuo Chen
Jie Zhou
Xu Sun
LLMAGAAML
135
71
0
17 Feb 2024
Exploring ChatGPT for Next-generation Information Retrieval:
  Opportunities and Challenges
Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges
Yizheng Huang
Jimmy X. Huang
120
11
0
17 Feb 2024
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
Yougang Lyu
Lingyong Yan
Shuaiqiang Wang
Haibo Shi
D. Yin
Fajie Yuan
Zhumin Chen
Maarten de Rijke
Zhaochun Ren
82
7
0
17 Feb 2024
GenDec: A robust generative Question-decomposition method for Multi-hop
  reasoning
GenDec: A robust generative Question-decomposition method for Multi-hop reasoning
Jian Wu
Linyi Yang
Yuliang Ji
Wenhao Huang
Börje F. Karlsson
Manabu Okumura
56
6
0
17 Feb 2024
Grasping the Essentials: Tailoring Large Language Models for Zero-Shot
  Relation Extraction
Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction
Sizhe Zhou
Yu Meng
Bowen Jin
Jiawei Han
71
11
0
17 Feb 2024
Previous
123...969798...126127128
Next