ResearchTrend.AI
Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLM ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,395 papers shown
Title
Can GPT-3.5 Generate and Code Discharge Summaries?
Matúš Falis
Aryo Pradipta Gema
Hang Dong
Luke Daines
Siddharth Basetti
Michael Holder
Rose S Penfold
Alexandra Birch
Beatrice Alex
MedIm
98
14
0
24 Jan 2024
ChatterBox: Multi-round Multimodal Referring and Grounding
Yunjie Tian
Tianren Ma
Lingxi Xie
Jihao Qiu
Xi Tang
Yuan Zhang
Jianbin Jiao
Qi Tian
Qixiang Ye
80
14
0
24 Jan 2024
Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models
Hongzhan Lin
Ziyang Luo
Wei Gao
Jing Ma
Bo Wang
Ruichao Yang
66
16
0
24 Jan 2024
Can AI Assistants Know What They Don't Know?
Qinyuan Cheng
Tianxiang Sun
Xiangyang Liu
Wenwei Zhang
Zhangyue Yin
Shimin Li
Linyang Li
Zhengfu He
Kai Chen
Xipeng Qiu
113
27
0
24 Jan 2024
UniMS-RAG: A Unified Multi-source Retrieval-Augmented Generation for Personalized Dialogue Systems
Hongru Wang
Wenyu Huang
Yang Deng
Rui Wang
Zezhong Wang
Yufei Wang
Fei Mi
Jeff Z. Pan
Kam-Fai Wong
RALM
114
33
0
24 Jan 2024
Adaptive Crowdsourcing Via Self-Supervised Learning
Anmol Kagrecha
Henrik Marklund
Benjamin Van Roy
Hong Jun Jeon
Richard Zeckhauser
FedML
92
0
0
24 Jan 2024
TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data
Fengbin Zhu
Ziyang Liu
Fuli Feng
Chao Wang
Moxin Li
Tat-Seng Chua
LMTD LRM
55
17
0
24 Jan 2024
Language-Guided World Models: A Model-Based Approach to AI Control
Alex Zhang
Khanh Nguyen
Jens Tuyls
Albert Lin
Karthik Narasimhan
LLMAG
90
7
0
24 Jan 2024
AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents
Chang Ma
Junlei Zhang
Zhihao Zhu
Cheng Yang
Yujiu Yang
Yaohui Jin
Zhenzhong Lan
Lingpeng Kong
Junxian He
ELM LLMAG
86
75
0
24 Jan 2024
ARGS: Alignment as Reward-Guided Search
Maxim Khanov
Jirayu Burapacheep
Yixuan Li
130
62
0
23 Jan 2024
The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts
Lingfeng Shen
Weiting Tan
Sihao Chen
Yunmo Chen
Jingyu Zhang
Haoran Xu
Boyuan Zheng
Philipp Koehn
Daniel Khashabi
92
49
0
23 Jan 2024
Seed-Guided Fine-Grained Entity Typing in Science and Engineering Domains
Yu Zhang
Yunyi Zhang
Yanzhen Shen
Yu Deng
Lucian Popa
Larisa Shwartz
ChengXiang Zhai
Jiawei Han
74
4
0
23 Jan 2024
HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments
Qinhong Zhou
Sunli Chen
Yisong Wang
Haozhe Xu
Weihua Du
Hongxin Zhang
Yilun Du
Josh Tenenbaum
Chuang Gan
AI4CE
81
18
0
23 Jan 2024
A Safe Reinforcement Learning Algorithm for Supervisory Control of Power Plants
Yixuan Sun
Sami Khairy
Richard B. Vilim
Rui Hu
Akshay J. Dave
101
4
0
23 Jan 2024
From Understanding to Utilization: A Survey on Explainability for Large Language Models
Haoyan Luo
Lucia Specia
136
25
0
23 Jan 2024
Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model
Zhiwei He
Xing Wang
Wenxiang Jiao
Zhuosheng Zhang
Rui Wang
Shuming Shi
Zhaopeng Tu
ALM
113
27
0
23 Jan 2024
SLANG: New Concept Comprehension of Large Language Models
Lingrui Mei
Shenghua Liu
Yiwei Wang
Baolong Bi
Xueqi Chen
KELM
77
8
0
23 Jan 2024
Small Language Model Meets with Reinforced Vision Vocabulary
Haoran Wei
Lingyu Kong
Jinyue Chen
Liang Zhao
Zheng Ge
En Yu
Jian‐Yuan Sun
Chunrui Han
Xiangyu Zhang
VLM
127
41
0
23 Jan 2024
CCA: Collaborative Competitive Agents for Image Editing
Tiankai Hang
Shuyang Gu
Dong Chen
Xin Geng
Baining Guo
166
5
0
23 Jan 2024
GRATH: Gradual Self-Truthifying for Large Language Models
Weixin Chen
Basel Alomair
Yue Liu
HILM SyDa
51
6
0
22 Jan 2024
WARM: On the Benefits of Weight Averaged Reward Models
Alexandre Ramé
Nino Vieillard
Léonard Hussenot
Robert Dadashi
Geoffrey Cideron
Olivier Bachem
Johan Ferret
196
104
0
22 Jan 2024
Revisiting Demonstration Selection Strategies in In-Context Learning
Keqin Peng
Liang Ding
Yancheng Yuan
Xuebo Liu
Min Zhang
Y. Ouyang
Dacheng Tao
117
35
0
22 Jan 2024
AI for social science and social science of AI: A Survey
Ruoxi Xu
Yingfei Sun
Mengjie Ren
Shiguang Guo
Ruotong Pan
Hongyu Lin
Le Sun
Xianpei Han
107
57
0
22 Jan 2024
A Framework to Implement 1+N Multi-task Fine-tuning Pattern in LLMs Using the CGC-LORA Algorithm
Chao Song
Zhihao Ye
Qiqiang Lin
Qiuying Peng
Jun Wang
109
0
0
22 Jan 2024
Speak It Out: Solving Symbol-Related Problems with Symbol-to-Language Conversion for Language Models
Yile Wang
Sijie Cheng
Zixin Sun
Peng Li
Yang Liu
ReLM LRM
95
5
0
22 Jan 2024
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Ling Yang
Zhaochen Yu
Chenlin Meng
Minkai Xu
Stefano Ermon
Tengjiao Wang
CoGe DiffM
136
137
0
22 Jan 2024
SMUTF: Schema Matching Using Generative Tags and Hybrid Features
Yu Zhang
Mei Di
Haozheng Luo
Chenwei Xu
Richard Tzong-Han Tsai
111
7
0
22 Jan 2024
Emojis Decoded: Leveraging ChatGPT for Enhanced Understanding in Social Media Communications
Yuhang Zhou
Paiheng Xu
Xiyao Wang
Xuan Lu
Ge Gao
Wei Ai
162
7
0
22 Jan 2024
In-context Learning with Retrieved Demonstrations for Language Models: A Survey
Man Luo
Xin Xu
Yue Liu
Panupong Pasupat
Mehran Kazemi
RALM
164
70
0
21 Jan 2024
Over-Reasoning and Redundant Calculation of Large Language Models
Cheng-Han Chiang
Hunghuei Lee
LRM
146
13
0
21 Jan 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
Songyang Gao
Qiming Ge
Wei Shen
Shihan Dou
Junjie Ye
...
Yicheng Zou
Zhi Chen
Hang Yan
Qi Zhang
Dahua Lin
95
11
0
21 Jan 2024
ProLex: A Benchmark for Language Proficiency-oriented Lexical Substitution
Xuanming Zhang
Zixun Chen
Zhou Yu
54
4
0
21 Jan 2024
Prompting Large Vision-Language Models for Compositional Reasoning
Timothy Ossowski
Ming Jiang
Junjie Hu
CoGe VLM LRM
102
3
0
20 Jan 2024
Orion-14B: Open-source Multilingual Large Language Models
Du Chen
Yi Huang
Xiaopu Li
Yongqiang Li
Yongqiang Liu
Haihui Pan
Leichao Xu
Dacheng Zhang
Zhipeng Zhang
Kun Han
64
4
0
20 Jan 2024
InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance
Pengyu Wang
Dong Zhang
Linyang Li
Chenkun Tan
Xinghao Wang
Ke Ren
Botian Jiang
Xipeng Qiu
LLMSV
102
49
0
20 Jan 2024
Enhancing Large Language Models for Clinical Decision Support by Incorporating Clinical Practice Guidelines
David Oniani
Xizhi Wu
Shyam Visweswaran
S. Kapoor
Shravan Kooragayalu
Katelyn Polanska
Yanshan Wang
LM&MA ELM AI4MH
58
12
0
20 Jan 2024
TypeDance: Creating Semantic Typographic Logos from Image through Personalized Generation
Shishi Xiao
Liangwei Wang
Xiaojuan Ma
Wei Zeng
100
20
0
20 Jan 2024
The Radiation Oncology NLP Database
Zheng Liu
J. Holmes
Wenxiong Liao
Chenbin Liu
Hua Zhou
...
Quanzheng Li
Xiang Li
Tianming Liu
Jiajian Shen
Wei Liu
LM&MA AI4CE
79
3
0
19 Jan 2024
Reinforcement learning for question answering in programming domain using public community scoring as a human feedback
Alexey Gorbatovski
Sergey Kovalchuk
27
3
0
19 Jan 2024
Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning
Adib Hasan
Ileana Rugina
Alex Wang
AAML
96
24
0
19 Jan 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Tianle Cai
Yuhong Li
Zhengyang Geng
Hongwu Peng
Jason D. Lee
De-huai Chen
Tri Dao
198
315
0
19 Jan 2024
Knowledge Verification to Nip Hallucination in the Bud
Fanqi Wan
Xinting Huang
Leyang Cui
Xiaojun Quan
Wei Bi
Shuming Shi
HILM
64
4
0
19 Jan 2024
PHOENIX: Open-Source Language Adaption for Direct Preference Optimization
Matthias Uhlig
Sigurd Schacht
Sudarshan Kamath Barkur
ALM
57
1
0
19 Jan 2024
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Ruizhe Li
Chao Zhang
Pin-Yu Chen
Ensiong Chng
102
25
0
19 Jan 2024
MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning
Chenyu Wang
Weixin Luo
Qianyu Chen
Haonan Mai
Jindi Guo
Sixun Dong
Xiaohua Xuan
MLLM LLMAG
155
20
0
19 Jan 2024
Can Large Language Model Summarizers Adapt to Diverse Scientific Communication Goals?
Marcio Fonseca
Shay B. Cohen
85
12
0
18 Jan 2024
Inconsistent dialogue responses and how to recover from them
Mian Zhang
Lifeng Jin
Linfeng Song
Haitao Mi
Dong Yu
68
1
0
18 Jan 2024
Bridging Cultural Nuances in Dialogue Agents through Cultural Value Surveys
Yong Cao
Min Chen
Daniel Hershcovich
124
7
0
18 Jan 2024
ChatQA: Surpassing GPT-4 on Conversational QA and RAG
Zihan Liu
Ming-Yu Liu
Rajarshi Roy
Peng Xu
Chankyu Lee
Mohammad Shoeybi
Bryan Catanzaro
ALM RALM AI4MH
107
48
0
18 Jan 2024
LangProp: A code optimization framework using Large Language Models applied to driving
Shu Ishida
Gianluca Corrado
George Fedoseev
Hudson Yeo
Lloyd Russell
Jamie Shotton
João F. Henriques
Anthony Hu
115
11
0
18 Jan 2024