ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.02155
  4. Cited By
Training language models to follow instructions with human feedback

Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLMALM
ArXiv (abs)PDFHTML

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,392 papers shown
Title
Mixed Distillation Helps Smaller Language Model Better Reasoning
Mixed Distillation Helps Smaller Language Model Better Reasoning
Chenglin Li
Qianglong Chen
Liangyue Li
Wang Caiyu
Yicheng Li
Zhang Yin
Yin Zhang
LRM
81
15
0
17 Dec 2023
Silkie: Preference Distillation for Large Visual Language Models
Silkie: Preference Distillation for Large Visual Language Models
Lei Li
Zhihui Xie
Mukai Li
Shunian Chen
Peiyi Wang
Liang Chen
Yazheng Yang
Benyou Wang
Lingpeng Kong
MLLM
190
80
0
17 Dec 2023
Do LLMs Work on Charts? Designing Few-Shot Prompts for Chart Question
  Answering and Summarization
Do LLMs Work on Charts? Designing Few-Shot Prompts for Chart Question Answering and Summarization
Do Xuan Long
Mohammad Hassanpour
Ahmed Masry
P. Kavehzadeh
Enamul Hoque
Shafiq Joty
LRM
47
10
0
17 Dec 2023
Policy Optimization in RLHF: The Impact of Out-of-preference Data
Policy Optimization in RLHF: The Impact of Out-of-preference Data
Ziniu Li
Tian Xu
Yang Yu
107
34
0
17 Dec 2023
When Parameter-efficient Tuning Meets General-purpose Vision-language
  Models
When Parameter-efficient Tuning Meets General-purpose Vision-language Models
Yihang Zhai
Haixin Wang
Jianlong Chang
Xinlong Yang
Jinan Sun
Shikun Zhang
Qi Tian
VLMMLLM
66
1
0
16 Dec 2023
Let AI Entertain You: Increasing User Engagement with Generative AI and
  Rejection Sampling
Let AI Entertain You: Increasing User Engagement with Generative AI and Rejection Sampling
Jingying Zeng
Jaewon Yang
Waleed Malik
Xiao Yan
Richard Huang
Qi He
54
1
0
16 Dec 2023
ProTIP: Progressive Tool Retrieval Improves Planning
ProTIP: Progressive Tool Retrieval Improves Planning
R. Anantha
Bortik Bandyopadhyay
Anirudh Kashi
Sayantan Mahinder
Andrew W Hill
Srinivas Chappidi
53
6
0
16 Dec 2023
One-Shot Learning as Instruction Data Prospector for Large Language
  Models
One-Shot Learning as Instruction Data Prospector for Large Language Models
Yunshui Li
Binyuan Hui
Xiaobo Xia
Jiaxi Yang
Min Yang
...
Ling-Hao Chen
Junhao Liu
Tongliang Liu
Fei Huang
Yongbin Li
109
36
0
16 Dec 2023
Low-resource classification of mobility functioning information in
  clinical sentences using large language models
Low-resource classification of mobility functioning information in clinical sentences using large language models
Tuan-Dung Le
Thanh Duong
Thanh Thieu
53
0
0
15 Dec 2023
Faithful Persona-based Conversational Dataset Generation with Large
  Language Models
Faithful Persona-based Conversational Dataset Generation with Large Language Models
Pegah Jandaghi
XiangHai Sheng
Xinyi Bai
Jay Pujara
Hakim Sidahmed
99
24
0
15 Dec 2023
LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models
  via MoE-Style Plugin
LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin
Shihan Dou
Enyu Zhou
Yan Liu
Songyang Gao
Jun Zhao
...
Jiang Zhu
Rui Zheng
Tao Gui
Qi Zhang
Xuanjing Huang
CLLMoEKELM
83
36
0
15 Dec 2023
SMILE: Multimodal Dataset for Understanding Laughter in Video with
  Language Models
SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models
Lee Hyun
Kim Sung-Bin
Seungju Han
Youngjae Yu
Tae-Hyun Oh
100
15
0
15 Dec 2023
Silent Guardian: Protecting Text from Malicious Exploitation by Large
  Language Models
Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models
Jiawei Zhao
Kejiang Chen
Xianjian Yuan
Yuang Qi
Weiming Zhang
Neng H. Yu
97
9
0
15 Dec 2023
Riveter: Measuring Power and Social Dynamics Between Entities
Riveter: Measuring Power and Social Dynamics Between Entities
Maria Antoniak
Anjalie Field
Jimin Mun
Melanie Walsh
Lauren F. Klein
Maarten Sap
58
9
0
15 Dec 2023
GSVA: Generalized Segmentation via Multimodal Large Language Models
GSVA: Generalized Segmentation via Multimodal Large Language Models
Zhuofan Xia
Dongchen Han
Yizeng Han
Xuran Pan
Shiji Song
Gao Huang
VLM
152
68
0
15 Dec 2023
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak
  Supervision
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Collin Burns
Pavel Izmailov
Jan Hendrik Kirchner
Bowen Baker
Leo Gao
...
Adrien Ecoffet
Manas Joglekar
Jan Leike
Ilya Sutskever
Jeff Wu
ELM
143
299
0
14 Dec 2023
Promptable Behaviors: Personalizing Multi-Objective Rewards from Human
  Preferences
Promptable Behaviors: Personalizing Multi-Objective Rewards from Human Preferences
Minyoung Hwang
Luca Weihs
Chanwoo Park
Kimin Lee
Aniruddha Kembhavi
Kiana Ehsani
90
20
0
14 Dec 2023
Self-Evaluation Improves Selective Generation in Large Language Models
Self-Evaluation Improves Selective Generation in Large Language Models
Jie Jessie Ren
Yao-Min Zhao
Tu Vu
Peter J. Liu
Balaji Lakshminarayanan
ELM
97
41
0
14 Dec 2023
DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral
  Planning States for Autonomous Driving
DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving
Wenhai Wang
Jiangwei Xie
ChuanYang Hu
Haoming Zou
Jianan Fan
...
Lewei Lu
Xizhou Zhu
Xiaogang Wang
Yu Qiao
Jifeng Dai
106
146
0
14 Dec 2023
The Earth is Flat because...: Investigating LLMs' Belief towards
  Misinformation via Persuasive Conversation
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation
Rongwu Xu
Brian S. Lin
Shujian Yang
Tianqi Zhang
Weiyan Shi
Tianwei Zhang
Zhixuan Fang
Wei Xu
Han Qiu
174
61
0
14 Dec 2023
Towards Verifiable Text Generation with Evolving Memory and
  Self-Reflection
Towards Verifiable Text Generation with Evolving Memory and Self-Reflection
Hao Sun
Hengyi Cai
Bo Wang
Yingyan Hou
Xiaochi Wei
Shuaiqiang Wang
Yan Zhang
D. Yin
148
10
0
14 Dec 2023
TAP4LLM: Table Provider on Sampling, Augmenting, and Packing
  Semi-structured Data for Large Language Model Reasoning
TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning
Yuan Sui
Jiaru Zou
Mengyu Zhou
Xinyi He
Lun Du
Shi Han
Dongmei Zhang
LRMLMTD
64
25
0
14 Dec 2023
Depicting Beyond Scores: Advancing Image Quality Assessment through
  Multi-modal Language Models
Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models
Zhiyuan You
Zheyuan Li
Jinjin Gu
Zhenfei Yin
Tianfan Xue
Chao Dong
EGVM
83
42
0
14 Dec 2023
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human
  Annotations
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations
Peiyi Wang
Lei Li
Zhihong Shao
R. X. Xu
Damai Dai
Yifei Li
Deli Chen
Y.Wu
Zhifang Sui
AIMatLRMALM
205
398
0
14 Dec 2023
Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent
Haoran Liao
Qinyi Du
Shaohua Hu
Hao He
Yanyan Xu
Jidong Tian
Yaohui Jin
LRMAI4CE
65
1
0
14 Dec 2023
Evaluating Large Language Models for Health-related Queries with
  Presuppositions
Evaluating Large Language Models for Health-related Queries with Presuppositions
Navreet Kaur
Monojit Choudhury
Danish Pruthi
HILMELM
81
4
0
14 Dec 2023
TigerBot: An Open Multilingual Multitask LLM
TigerBot: An Open Multilingual Multitask LLM
Ye Chen
Wei Cai
Liangming Wu
Xiaowei Li
Zhanxuan Xin
Cong Fu
271
11
0
14 Dec 2023
Future-proofing geotechnics workflows: accelerating problem-solving with
  large language models
Future-proofing geotechnics workflows: accelerating problem-solving with large language models
Stephen Wu
Yu Otake
Daijiro Mizutani
Chang Liu
Kotaro Asano
...
Taiga Saito
Akihiro Shioi
M. Takenobu
Keigo Tsukioka
Ryo Yoshikawa
AI4CE
34
13
0
14 Dec 2023
Distributional Preference Learning: Understanding and Accounting for
  Hidden Context in RLHF
Distributional Preference Learning: Understanding and Accounting for Hidden Context in RLHF
Anand Siththaranjan
Cassidy Laidlaw
Dylan Hadfield-Menell
134
72
0
13 Dec 2023
Beyond English: Evaluating LLMs for Arabic Grammatical Error Correction
Beyond English: Evaluating LLMs for Arabic Grammatical Error Correction
S. Kwon
Gagan Bhatia
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
105
18
0
13 Dec 2023
Graph vs. Sequence: An Empirical Study on Knowledge Forms for
  Knowledge-Grounded Dialogue
Graph vs. Sequence: An Empirical Study on Knowledge Forms for Knowledge-Grounded Dialogue
Yizhe Yang
Heyan Huang
Yuhang Liu
Yang Gao
63
1
0
13 Dec 2023
An Invitation to Deep Reinforcement Learning
An Invitation to Deep Reinforcement Learning
Bernhard Jaeger
Andreas Geiger
OffRLOOD
197
5
0
13 Dec 2023
Exploring Large Language Models to Facilitate Variable Autonomy for
  Human-Robot Teaming
Exploring Large Language Models to Facilitate Variable Autonomy for Human-Robot Teaming
Younes Lakhnati
Max Pascher
Jens Gerken
LLMAGLM&Ro
76
4
0
12 Dec 2023
LLMs Perform Poorly at Concept Extraction in Cyber-security Research
  Literature
LLMs Perform Poorly at Concept Extraction in Cyber-security Research Literature
Maxime Wursch
Andrei Kucharavy
Dimitri Percia David
Alain Mermoud
78
6
0
12 Dec 2023
Rethinking the Instruction Quality: LIFT is What You Need
Rethinking the Instruction Quality: LIFT is What You Need
Yang Xu
Yongqiang Yao
Yufan Huang
Mengnan Qi
Maoquan Wang
Bin Gu
Neel Sundaresan
ALM
83
35
0
12 Dec 2023
Multimodality of AI for Education: Towards Artificial General
  Intelligence
Multimodality of AI for Education: Towards Artificial General Intelligence
Gyeong-Geon Lee
Lehong Shi
Ehsan Latif
Yizhu Gao
Arne Bewersdorff
...
Zheng Liu
Hui Wang
Gengchen Mai
Tiaming Liu
Xiaoming Zhai
112
43
0
10 Dec 2023
Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
O. Ovadia
Menachem Brief
Moshik Mishaeli
Oren Elisha
RALM
116
153
0
10 Dec 2023
Neither hype nor gloom do DNNs justice
Neither hype nor gloom do DNNs justice
Gaurav Malhotra
Christian Tsvetkov
B. D. Evans
92
125
0
08 Dec 2023
FREDSum: A Dialogue Summarization Corpus for French Political Debates
FREDSum: A Dialogue Summarization Corpus for French Political Debates
Virgile Rennard
Guokan Shang
Damien Grari
Julie Hunter
Michalis Vazirgiannis
75
3
0
08 Dec 2023
MEVG: Multi-event Video Generation with Text-to-Video Models
MEVG: Multi-event Video Generation with Text-to-Video Models
Gyeongrok Oh
Jaehwan Jeong
Sieun Kim
Wonmin Byeon
Jinkyu Kim
Sungwoong Kim
Sangpil Kim
VGenDiffM
117
23
0
07 Dec 2023
Mitigating Open-Vocabulary Caption Hallucinations
Mitigating Open-Vocabulary Caption Hallucinations
Assaf Ben-Kish
Moran Yanuka
Morris Alper
Raja Giryes
Hadar Averbuch-Elor
MLLMVLM
123
6
0
06 Dec 2023
UFineBench: Towards Text-based Person Retrieval with Ultra-fine
  Granularity
UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity
Jia-li Zuo
Hanyu Zhou
Ying Nie
Feng Zhang
Tianyu Guo
Nong Sang
Yunhe Wang
Changxin Gao
141
23
0
06 Dec 2023
ULMA: Unified Language Model Alignment with Human Demonstration and
  Point-wise Preference
ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference
Tianchi Cai
Xierui Song
Jiyan Jiang
Fei Teng
Jinjie Gu
Guannan Zhang
ALM
94
5
0
05 Dec 2023
Visual Hindsight Self-Imitation Learning for Interactive Navigation
Visual Hindsight Self-Imitation Learning for Interactive Navigation
Kibeom Kim
Kisung Shin
Min Whoo Lee
Moonhoen Lee
Minsu Lee
Byoung-Tak Zhang
94
2
0
05 Dec 2023
Breast Ultrasound Report Generation using LangChain
Breast Ultrasound Report Generation using LangChain
Jaeyoung Huh
HyunWook Park
Jong Chul Ye
36
6
0
05 Dec 2023
GPT4Point: A Unified Framework for Point-Language Understanding and Generation
GPT4Point: A Unified Framework for Point-Language Understanding and Generation
Zhangyang Qi
Ye Fang
Zeyi Sun
Xiaoyang Wu
Tong Wu
Jiaqi Wang
Dahua Lin
Hengshuang Zhao
MLLM
184
42
0
05 Dec 2023
Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images
Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images
Zhuoran Yu
Chenchen Zhu
Sean Culatana
Raghuraman Krishnamoorthi
Fanyi Xiao
Yong Jae Lee
182
15
0
04 Dec 2023
Honesty Is the Best Policy: Defining and Mitigating AI Deception
Honesty Is the Best Policy: Defining and Mitigating AI Deception
Francis Rhys Ward
Francesco Belardinelli
Francesca Toni
Tom Everitt
184
31
0
03 Dec 2023
RLHF and IIA: Perverse Incentives
RLHF and IIA: Perverse Incentives
Wanqiao Xu
Shi Dong
Xiuyuan Lu
Grace Lam
Zheng Wen
Benjamin Van Roy
86
2
0
02 Dec 2023
CLIP-QDA: An Explainable Concept Bottleneck Model
CLIP-QDA: An Explainable Concept Bottleneck Model
Rémi Kazmierczak
Eloise Berthier
Goran Frehse
Gianni Franchi
101
7
0
30 Nov 2023
Previous
123...109110111...126127128
Next