ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.02155
  4. Cited By
Training language models to follow instructions with human feedback

Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLMALM
ArXiv (abs)PDFHTML

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,380 papers shown
Title
Are Large Language Models Good Statisticians?
Are Large Language Models Good Statisticians?
Yizhang Zhu
Shiyin Du
Boyan Li
Yuyu Luo
Nan Tang
ELM
85
18
0
12 Jun 2024
Collective Constitutional AI: Aligning a Language Model with Public
  Input
Collective Constitutional AI: Aligning a Language Model with Public Input
Saffron Huang
Divya Siddarth
Liane Lovitt
Thomas I. Liao
Esin Durmus
Alex Tamkin
Deep Ganguli
ELM
137
83
0
12 Jun 2024
Prompt-Based Length Controlled Generation with Multiple Control Types
Prompt-Based Length Controlled Generation with Multiple Control Types
Renlong Jie
Xiaojun Meng
Lifeng Shang
Xin Jiang
Qun Liu
91
8
0
12 Jun 2024
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL
Zijin Hong
Zheng Yuan
Qinggang Zhang
Hao Chen
Junnan Dong
Feiran Huang
Xiao Huang
197
74
0
12 Jun 2024
UICoder: Finetuning Large Language Models to Generate User Interface
  Code through Automated Feedback
UICoder: Finetuning Large Language Models to Generate User Interface Code through Automated Feedback
Jason Wu
E. Schoop
Alan Leung
Titus Barik
Jeffrey P. Bigham
Jeffrey Nichols
68
14
0
11 Jun 2024
REAL Sampling: Boosting Factuality and Diversity of Open-Ended
  Generation via Asymptotic Entropy
REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy
Haw-Shiuan Chang
Nanyun Peng
Mohit Bansal
Anil Ramakrishna
Tagyoung Chung
HILM
82
3
0
11 Jun 2024
OPTune: Efficient Online Preference Tuning
OPTune: Efficient Online Preference Tuning
Lichang Chen
Jiuhai Chen
Chenxi Liu
John Kirchenbauer
Davit Soselia
Chen Zhu
Tom Goldstein
Dinesh Manocha
Heng Huang
70
5
0
11 Jun 2024
Image Textualization: An Automatic Framework for Creating Accurate and
  Detailed Image Descriptions
Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions
Renjie Pi
Jianshu Zhang
Jipeng Zhang
Boyao Wang
Zhekai Chen
Tong Zhang
3DV
87
24
0
11 Jun 2024
TextGrad: Automatic "Differentiation" via Text
TextGrad: Automatic "Differentiation" via Text
Mert Yuksekgonul
Federico Bianchi
Joseph Boen
Sheng Liu
Zhi Huang
Carlos Guestrin
James Zou
LLMAGOODAI4CE
108
48
0
11 Jun 2024
Paraphrasing in Affirmative Terms Improves Negation Understanding
Paraphrasing in Affirmative Terms Improves Negation Understanding
MohammadHossein Rezaei
Eduardo Blanco
79
2
0
11 Jun 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
137
2
0
11 Jun 2024
DR-RAG: Applying Dynamic Document Relevance to Retrieval-Augmented
  Generation for Question-Answering
DR-RAG: Applying Dynamic Document Relevance to Retrieval-Augmented Generation for Question-Answering
Zijian Hei
Weiling Liu
Wenjie Ou
Juyi Qiao
Junming Jiao
Guowen Song
Ting Tian
Yi Lin
RALM
114
7
0
11 Jun 2024
Multi-objective Reinforcement learning from AI Feedback
Multi-objective Reinforcement learning from AI Feedback
Marcus Williams
97
1
0
11 Jun 2024
DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation
  through Dual Learning Feedback Mechanisms
DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms
Andong Chen
Lianzhang Lou
Kehai Chen
Xuefeng Bai
Yang Xiang
Muyun Yang
Tiejun Zhao
Min Zhang
VLM
117
13
0
11 Jun 2024
CoEvol: Constructing Better Responses for Instruction Finetuning through
  Multi-Agent Cooperation
CoEvol: Constructing Better Responses for Instruction Finetuning through Multi-Agent Cooperation
Renhao Li
Minghuan Tan
Derek F. Wong
Min Yang
LLMAG
68
1
0
11 Jun 2024
FoodSky: A Food-oriented Large Language Model that Passes the Chef and
  Dietetic Examination
FoodSky: A Food-oriented Large Language Model that Passes the Chef and Dietetic Examination
Pengfei Zhou
Weiqing Min
Chaoran Fu
Ying Jin
Mingyu Huang
Xiangyang Li
Shuhuan Mei
Shuqiang Jiang
90
10
0
11 Jun 2024
Scaling Large Language Model-based Multi-Agent Collaboration
Scaling Large Language Model-based Multi-Agent Collaboration
Chen Qian
Zihao Xie
YiFei Wang
Wei Liu
Yufan Dang
...
Zhuoyun Du
Weize Chen
Cheng Yang
Zhiyuan Liu
Maosong Sun
AI4CELLMAGLM&Ro
178
78
0
11 Jun 2024
Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees
Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees
Sijia Chen
Yibo Wang
Yi-Feng Wu
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
Lijun Zhang
LLMAGLRM
123
18
0
11 Jun 2024
3D-Properties: Identifying Challenges in DPO and Charting a Path Forward
3D-Properties: Identifying Challenges in DPO and Charting a Path Forward
Yuzi Yan
Yibo Miao
J. Li
Yipin Zhang
Jian Xie
Zhijie Deng
Dong Yan
109
13
0
11 Jun 2024
Entropy-Reinforced Planning with Large Language Models for Drug Discovery
Entropy-Reinforced Planning with Large Language Models for Drug Discovery
Xuefeng Liu
Chih-chan Tien
Peng Ding
Songhao Jiang
Rick L. Stevens
144
5
0
11 Jun 2024
Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications
Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications
Junlin Wang
Tianyi Yang
Roy Xie
Bhuwan Dhingra
SILMAAML
78
5
0
10 Jun 2024
TRINS: Towards Multimodal Language Models that Can Read
TRINS: Towards Multimodal Language Models that Can Read
Ruiyi Zhang
Yanzhe Zhang
Jian Chen
Yufan Zhou
Jiuxiang Gu
Changyou Chen
Tong Sun
VLM
82
6
0
10 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image
  Generation
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
136
301
0
10 Jun 2024
Multimodal Contextualized Semantic Parsing from Speech
Multimodal Contextualized Semantic Parsing from Speech
Jordan Voas
Raymond Mooney
David Harwath
77
0
0
10 Jun 2024
Language Models are Alignable Decision-Makers: Dataset and Application
  to the Medical Triage Domain
Language Models are Alignable Decision-Makers: Dataset and Application to the Medical Triage Domain
Brian Hu
Bill Ray
Alice Leung
Amy Summerville
David Joy
Christopher Funk
Arslan Basharat
86
6
0
10 Jun 2024
Margin-aware Preference Optimization for Aligning Diffusion Models
  without Reference
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
Jiwoo Hong
Sayak Paul
Noah Lee
Kashif Rasul
James Thorne
Jongheon Jeong
106
18
0
10 Jun 2024
MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific
  Workflows
MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows
Xingjian Zhang
Yutong Xie
Jin Huang
Jinge Ma
Zhaoying Pan
...
Ziyang Xiong
Tolga Ergen
Dongsub Shim
Honglak Lee
Qiaozhu Mei
107
12
0
10 Jun 2024
Tx-LLM: A Large Language Model for Therapeutics
Tx-LLM: A Large Language Model for Therapeutics
Juan Manuel Zambrano Chaves
Eric Wang
Tao Tu
E. D. Vaishnav
Byron Lee
S. S. Mahdavi
Christopher Semturs
David Fleet
Vivek Natarajan
Shekoofeh Azizi
LM&MA
125
20
0
10 Jun 2024
Unveiling the Safety of GPT-4o: An Empirical Study using Jailbreak
  Attacks
Unveiling the Safety of GPT-4o: An Empirical Study using Jailbreak Attacks
Zonghao Ying
Aishan Liu
Xianglong Liu
Dacheng Tao
124
25
0
10 Jun 2024
Comparing Data Augmentation Methods for End-to-End Task-Oriented Dialog
  Systems
Comparing Data Augmentation Methods for End-to-End Task-Oriented Dialog Systems
Christos Vlachos
Themos Stafylakis
Ion Androutsopoulos
94
1
0
10 Jun 2024
Prompting Large Language Models with Audio for General-Purpose Speech
  Summarization
Prompting Large Language Models with Audio for General-Purpose Speech Summarization
Wonjune Kang
Deb Roy
LRM
71
7
0
10 Jun 2024
Aligning Large Language Models with Representation Editing: A Control
  Perspective
Aligning Large Language Models with Representation Editing: A Control Perspective
Lingkai Kong
Haorui Wang
Wenhao Mu
Yuanqi Du
Yuchen Zhuang
Yifei Zhou
Yue Song
Rongzhi Zhang
Kai Wang
Chao Zhang
98
26
0
10 Jun 2024
Safety Alignment Should Be Made More Than Just a Few Tokens Deep
Safety Alignment Should Be Made More Than Just a Few Tokens Deep
Xiangyu Qi
Ashwinee Panda
Kaifeng Lyu
Xiao Ma
Subhrajit Roy
Ahmad Beirami
Prateek Mittal
Peter Henderson
120
142
0
10 Jun 2024
MolX: Enhancing Large Language Models for Molecular Learning with A Multi-Modal Extension
MolX: Enhancing Large Language Models for Molecular Learning with A Multi-Modal Extension
Khiem Le
Zhichun Guo
Kaiwen Dong
Xiaobao Huang
B. Nan
Roshni G. Iyer
Xiangliang Zhang
Olaf Wiest
Wei Wang
Nitesh Chawla
116
0
0
10 Jun 2024
Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching
Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Yipeng Zhang
Haitao Mi
Helen Meng
CLLKELM
164
8
0
10 Jun 2024
Information Theoretic Guarantees For Policy Alignment In Large Language
  Models
Information Theoretic Guarantees For Policy Alignment In Large Language Models
Youssef Mroueh
97
8
0
09 Jun 2024
Distributional Preference Alignment of LLMs via Optimal Transport
Distributional Preference Alignment of LLMs via Optimal Transport
Igor Melnyk
Youssef Mroueh
Brian M. Belgodere
Mattia Rigotti
Apoorva Nitsure
Mikhail Yurochkin
Kristjan Greenewald
Jirí Navrátil
Jerret Ross
112
13
0
09 Jun 2024
Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based
  Interactions
Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions
Cheng Tan
Dongxin Lyu
Siyuan Li
Zhangyang Gao
Jingxuan Wei
Siqi Ma
Zicheng Liu
Stan Z. Li
LLMAG
81
13
0
09 Jun 2024
A Superalignment Framework in Autonomous Driving with Large Language
  Models
A Superalignment Framework in Autonomous Driving with Large Language Models
Xiangrui Kong
Thomas Braunl
Marco Fahmi
Yue Wang
82
9
0
09 Jun 2024
How Alignment and Jailbreak Work: Explain LLM Safety through
  Intermediate Hidden States
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
Zhenhong Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Yongbin Li
121
41
0
09 Jun 2024
A Reality check of the benefits of LLM in business
A Reality check of the benefits of LLM in business
Ming Cheung
92
4
0
09 Jun 2024
F-LMM: Grounding Frozen Large Multimodal Models
F-LMM: Grounding Frozen Large Multimodal Models
Size Wu
Sheng Jin
Wenwei Zhang
Lumin Xu
Wentao Liu
Wei Li
Chen Change Loy
MLLM
199
15
0
09 Jun 2024
On the Worst Prompt Performance of Large Language Models
On the Worst Prompt Performance of Large Language Models
Bowen Cao
Deng Cai
Zhisong Zhang
Yuexian Zou
Wai Lam
ALMLRM
99
8
0
08 Jun 2024
Deconstructing The Ethics of Large Language Models from Long-standing
  Issues to New-emerging Dilemmas
Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas
Chengyuan Deng
Yiqun Duan
Xin Jin
Heng Chang
Yijun Tian
...
Kuofeng Gao
Sihong He
Jun Zhuang
Lu Cheng
Haohan Wang
AILaw
92
24
0
08 Jun 2024
Planning Like Human: A Dual-process Framework for Dialogue Planning
Planning Like Human: A Dual-process Framework for Dialogue Planning
Tao He
Lizi Liao
Yixin Cao
Yuanxing Liu
Ming Liu
Zerui Chen
Bing Qin
117
20
0
08 Jun 2024
CaLM: Contrasting Large and Small Language Models to Verify Grounded
  Generation
CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation
I-Hung Hsu
Zifeng Wang
Long T. Le
Lesly Miculicich
Nanyun Peng
Chen-Yu Lee
Tomas Pfister
HILM
112
4
0
08 Jun 2024
Flexible and Adaptable Summarization via Expertise Separation
Flexible and Adaptable Summarization via Expertise Separation
Preslav Nakov
Mingzhe Li
Shen Gao
Xin Cheng
Qingqing Zhu
Rui Yan
Xin Gao
Xiangliang Zhang
MoE
75
5
0
08 Jun 2024
Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from
  Imperfect Teacher Models in Low-Budget Scenarios
Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios
Yuhang Zhou
Wei Ai
96
7
0
08 Jun 2024
ChatSR: Multimodal Large Language Models for Scientific Formula Discovery
ChatSR: Multimodal Large Language Models for Scientific Formula Discovery
Yanjie Li
Weijun Li
Lina Yu
Min Wu
Jingyi Liu
Wenqiang Li
Shu Wei
Yusong Deng
OffRL
98
3
0
08 Jun 2024
SUMIE: A Synthetic Benchmark for Incremental Entity Summarization
SUMIE: A Synthetic Benchmark for Incremental Entity Summarization
Eunjeong Hwang
Yichao Zhou
Beliz Gunel
James Bradley Wendt
Sandeep Tata
RALM
66
3
0
07 Jun 2024
Previous
123...686970...126127128
Next