ResearchTrend.AI
Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLM ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,390 papers shown
Title
FinGPT-HPC: Efficient Pretraining and Finetuning Large Language Models for Financial Applications with High-Performance Computing
Xiao-Yang Liu
Jie Zhang
Guoxuan Wang
Weiqin Tong
Anwar Elwalid
84
4
0
21 Feb 2024
AgentScope: A Flexible yet Robust Multi-Agent Platform
Dawei Gao
Zitao Li
Xuchen Pan
Weirui Kuang
Zhijian Ma
...
Chen Cheng
Hongzhu Shi
Yaliang Li
Bolin Ding
Jingren Zhou
LLMAG
101
39
0
21 Feb 2024
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
Chenyang Song
Xu Han
Zhengyan Zhang
Shengding Hu
Xiyu Shi
...
Chen Chen
Zhiyuan Liu
Guanglin Li
Tao Yang
Maosong Sun
160
32
0
21 Feb 2024
The Lay Person's Guide to Biomedicine: Orchestrating Large Language Models
Zheheng Luo
Qianqian Xie
Sophia Ananiadou
79
0
0
21 Feb 2024
GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis
Yueqi Xie
Minghong Fang
Renjie Pi
Neil Zhenqiang Gong
117
36
0
21 Feb 2024
Retrieval Helps or Hurts? A Deeper Dive into the Efficacy of Retrieval Augmentation to Language Models
Seiji Maekawa
Hayate Iso
Sairam Gurajada
Nikita Bhutani
RALM KELM
110
14
0
21 Feb 2024
How Important is Domain Specificity in Language Models and Instruction Finetuning for Biomedical Relation Extraction?
Aviv Brokman
Ramakanth Kavuluru
LM&MA ALM
64
3
0
21 Feb 2024
STENCIL: Submodular Mutual Information Based Weak Supervision for Cold-Start Active Learning
Nathan Beck
Adithya Iyer
Rishabh K. Iyer
88
0
0
21 Feb 2024
RefuteBench: Evaluating Refuting Instruction-Following for Large Language Models
Jianhao Yan
Yun Luo
Yue Zhang
ALM LRM
105
10
0
21 Feb 2024
Potential and Challenges of Model Editing for Social Debiasing
Jianhao Yan
Futing Wang
Yafu Li
Yue Zhang
KELM
126
9
0
21 Feb 2024
A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models
Zihao Xu
Yi Liu
Gelei Deng
Yuekang Li
S. Picek
PILM AAML
104
44
0
21 Feb 2024
Factual consistency evaluation of summarization in the Era of large language models
Zheheng Luo
Qianqian Xie
Sophia Ananiadou
HILM
61
2
0
21 Feb 2024
Healthcare Copilot: Eliciting the Power of General LLMs for Medical Consultation
Zhiyao Ren
Yibing Zhan
Baosheng Yu
Liang Ding
Dacheng Tao
LM&MA
65
14
0
20 Feb 2024
Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems
Ivan Sekulić
Silvia Terragni
Victor Guimaraes
Nghia Khau
Bruna Guedes
Modestas Filipavicius
A. Manso
Roland Mathis
53
7
0
20 Feb 2024
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive
Arka Pal
Deep Karkhanis
Samuel Dooley
Manley Roberts
Siddartha Naidu
Colin White
OSLM
124
155
0
20 Feb 2024
RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian
Adrian Cosma
Ioan-Bogdan Iordache
Paolo Rosso
OffRL
45
3
0
20 Feb 2024
Bayesian Reward Models for LLM Alignment
Adam X. Yang
Maxime Robeyns
Thomas Coste
Zhengyan Shi
Jun Wang
Haitham Bou-Ammar
Laurence Aitchison
73
19
0
20 Feb 2024
Is the System Message Really Important to Jailbreaks in Large Language Models?
Xiaotian Zou
Yongkang Chen
Ke Li
81
14
0
20 Feb 2024
TreeEval: Benchmark-Free Evaluation of Large Language Models through Tree Planning
Xiang Li
Yunshi Lan
Chao Yang
ELM
63
11
0
20 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Dinesh Manocha
KELM VLM
175
135
0
20 Feb 2024
Event-level Knowledge Editing
Hao Peng
Xiaozhi Wang
Chunyang Li
Kaisheng Zeng
Jiangshan Duo
Yixin Cao
Lei Hou
Juanzi Li
KELM
95
7
0
20 Feb 2024
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Haoran Li
Qingxiu Dong
Zhengyang Tang
Chaojun Wang
Xingxing Zhang
...
Wei Lu
Zhifang Sui
Benyou Wang
Wai Lam
Furu Wei
SyDa
107
63
0
20 Feb 2024
Stable Knowledge Editing in Large Language Models
Zihao Wei
Liang Pang
Hanxing Ding
Jingcheng Deng
Huawei Shen
Xueqi Cheng
KELM
119
10
0
20 Feb 2024
CFEVER: A Chinese Fact Extraction and VERification Dataset
Ying-Jia Lin
Chun Lin
Chia-Jen Yeh
Yi-Ting Li
Yun-Yu Hu
Chih-Hao Hsu
Mei-Feng Lee
Hung-Yu Kao
HILM
76
5
0
20 Feb 2024
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
Philipp Mondorf
Barbara Plank
LRM
129
10
0
20 Feb 2024
The Impact of Demonstrations on Multilingual In-Context Learning: A Multidimensional Analysis
Miaoran Zhang
Vagrant Gautam
Mingyang Wang
Jesujoba Oluwadara Alabi
Xiaoyu Shen
Dietrich Klakow
Marius Mosbach
106
12
0
20 Feb 2024
GlórIA -- A Generative and Open Large Language Model for Portuguese
Ricardo Lopes
João Magalhães
David Semedo
61
8
0
20 Feb 2024
Exploring the Impact of Table-to-Text Methods on Augmenting LLM-based Question Answering with Domain Hybrid Data
Dehai Min
Nan Hu
Rihui Jin
Nuo Lin
Jiaoyan Chen
...
Yu Li
Guilin Qi
Yun Li
Nijun Li
Qianren Wang
LMTD
79
17
0
20 Feb 2024
Instruction-tuned Language Models are Better Knowledge Learners
Zhengbao Jiang
Zhiqing Sun
Weijia Shi
Pedro Rodriguez
Chunting Zhou
Graham Neubig
Xi Lin
Wen-tau Yih
Srinivasan Iyer
KELM
92
42
0
20 Feb 2024
ConVQG: Contrastive Visual Question Generation with Multimodal Guidance
Li Mi
Syrielle Montariol
J. Castillo-Navarro
Xianjie Dai
Antoine Bosselut
D. Tuia
59
4
0
20 Feb 2024
PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning
Gyeongman Kim
Doohyuk Jang
Eunho Yang
VLM
112
13
0
20 Feb 2024
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic
Fajri Koto
Haonan Li
Sara Shatnawi
Jad Doughman
Abdelrahman Boda Sadallah
...
Neha Sengupta
Shady Shehata
Nizar Habash
Preslav Nakov
Timothy Baldwin
ELM LRM
155
44
0
20 Feb 2024
PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs
An Liu
Zonghan Yang
Zhenhe Zhang
Qingyuan Hu
Peng Li
Ming Yan
Ji Zhang
Fei Huang
Yang Liu
ALM
60
2
0
20 Feb 2024
Tree-Planted Transformers: Unidirectional Transformer Language Models with Implicit Syntactic Supervision
Ryosuke Yoshida
Taiga Someya
Yohei Oseki
91
0
0
20 Feb 2024
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
Runlong Zhou
Simon S. Du
Beibin Li
OffRL
89
4
0
20 Feb 2024
Generative AI Security: Challenges and Countermeasures
Banghua Zhu
Norman Mu
Jiantao Jiao
David Wagner
AAML SILM
107
10
0
20 Feb 2024
A Unified Taxonomy-Guided Instruction Tuning Framework for Entity Set Expansion and Taxonomy Expansion
Yanzhen Shen
Yu Zhang
Yunyi Zhang
Jiawei Han
128
10
0
20 Feb 2024
Comparing Specialised Small and General Large Language Models on Text Classification: 100 Labelled Samples to Achieve Break-Even Performance
Branislav Pecher
Ivan Srba
Maria Bielikova
ALM
100
8
0
20 Feb 2024
Defending Jailbreak Prompts via In-Context Adversarial Game
Yujun Zhou
Yufei Han
Haomin Zhuang
Kehan Guo
Zhenwen Liang
Hongyan Bao
Xiangliang Zhang
LLMAG AAML
117
15
0
20 Feb 2024
Roadmap on Incentive Compatibility for AI Alignment and Governance in Sociotechnical Systems
Zhaowei Zhang
Fengshuo Bai
Mingzhi Wang
Haoyang Ye
Chengdong Ma
Yaodong Yang
77
6
0
20 Feb 2024
Standardize: Aligning Language Models with Expert-Defined Standards for Content Generation
Joseph Marvin Imperial
Gail Forey
Harish Tayyar Madabushi
ALM
70
3
0
19 Feb 2024
Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of Large Language Models
Loka Li
Zhenhao Chen
Guan-Hong Chen
Yixuan Zhang
Yusheng Su
Eric P. Xing
Kun Zhang
LRM
93
19
0
19 Feb 2024
The Revolution of Multimodal Large Language Models: A Survey
Davide Caffagni
Federico Cocchi
Luca Barsellotti
Nicholas Moratelli
Sara Sarto
Lorenzo Baraldi
Lorenzo Baraldi
Marcella Cornia
Rita Cucchiara
LRM VLM
139
64
0
19 Feb 2024
HunFlair2 in a cross-corpus evaluation of biomedical named entity recognition and normalization tools
Mario Sanger
Samuele Garda
Xing David Wang
Leon Weber-Genzel
Pia Droop
Benedikt Fuchs
Alan Akbik
Ulf Leser
83
7
0
19 Feb 2024
A Critical Evaluation of AI Feedback for Aligning Large Language Models
Archit Sharma
Sedrick Scott Keh
Eric Mitchell
Chelsea Finn
Kushal Arora
Thomas Kollar
ALM LLMAG
106
27
0
19 Feb 2024
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
Zhanhui Zhou
Jie Liu
Zhichen Dong
Jiaheng Liu
Chao Yang
Wanli Ouyang
Yu Qiao
96
22
0
19 Feb 2024
KARL: Knowledge-Aware Retrieval and Representations aid Retention and Learning in Students
Matthew Shu
Nishant Balepur
Shi Feng
Jordan L. Boyd-Graber
AI4Ed HAI
28
5
0
19 Feb 2024
Tables as Texts or Images: Evaluating the Table Reasoning Ability of LLMs and MLLMs
Naihao Deng
Zhenjie Sun
Ruiqi He
Aman Sikka
Yulong Chen
Lin Ma
Yue Zhang
Rada Mihalcea
LMTD
83
19
0
19 Feb 2024
High-quality Data-to-Text Generation for Severely Under-Resourced Languages with Out-of-the-box Large Language Models
Michela Lorandi
Anya Belz
60
5
0
19 Feb 2024
Task-Oriented Dialogue with In-Context Learning
Tom Bocklisch
Thomas Werkmeister
Daksh Varshneya
Alan Nichol
69
6
0
19 Feb 2024