ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.02155
  4. Cited By
Training language models to follow instructions with human feedback

Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLMALM
ArXiv (abs)PDFHTML

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,370 papers shown
Title
A Survey on Vision-Language-Action Models for Embodied AI
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
335
54
0
23 May 2024
Efficient Universal Goal Hijacking with Semantics-guided Prompt Organization
Efficient Universal Goal Hijacking with Semantics-guided Prompt Organization
Yihao Huang
Chong Wang
Xiaojun Jia
Qing Guo
Felix Juefei Xu
Jian Zhang
G. Pu
Yang Liu
109
9
0
23 May 2024
Your Large Language Models Are Leaving Fingerprints
Your Large Language Models Are Leaving Fingerprints
Hope McGovern
Rickard Stureborg
Yoshi Suhara
Dimitris Alikaniotis
DeLMO
93
14
0
22 May 2024
WordGame: Efficient & Effective LLM Jailbreak via Simultaneous
  Obfuscation in Query and Response
WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response
Tianrong Zhang
Bochuan Cao
Yuanpu Cao
Lu Lin
Prasenjit Mitra
Jinghui Chen
AAML
106
12
0
22 May 2024
Why Not Transform Chat Large Language Models to Non-English?
Why Not Transform Chat Large Language Models to Non-English?
Xiang Geng
Ming Zhu
Jiahuan Li
Zhejian Lai
Wei Zou
...
Xinglin Lyu
Min Zhang
Jiajun Chen
Hao Yang
Shujian Huang
68
2
0
22 May 2024
Image-of-Thought Prompting for Visual Reasoning Refinement in Multimodal
  Large Language Models
Image-of-Thought Prompting for Visual Reasoning Refinement in Multimodal Large Language Models
Qiji Zhou
Ruochen Zhou
Zike Hu
Panzhong Lu
Siyang Gao
Yue Zhang
LRM
106
17
0
22 May 2024
Dense Connector for MLLMs
Dense Connector for MLLMs
Huanjin Yao
Wenhao Wu
Taojiannan Yang
Yuxin Song
Mengxi Zhang
Haocheng Feng
Yifan Sun
Zhiheng Li
Wanli Ouyang
Jingdong Wang
MLLMVLM
102
25
0
22 May 2024
Do Language Models Enjoy Their Own Stories? Prompting Large Language
  Models for Automatic Story Evaluation
Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation
Cyril Chhun
Fabian M. Suchanek
Chloé Clavel
LRM
114
18
0
22 May 2024
Annotation-Efficient Preference Optimization for Language Model
  Alignment
Annotation-Efficient Preference Optimization for Language Model Alignment
Yuu Jinnai
Ukyo Honda
65
0
0
22 May 2024
LIRE: listwise reward enhancement for preference alignment
LIRE: listwise reward enhancement for preference alignment
Mingye Zhu
Yi Liu
Lei Zhang
Junbo Guo
Zhendong Mao
56
7
0
22 May 2024
Class-Conditional self-reward mechanism for improved Text-to-Image
  models
Class-Conditional self-reward mechanism for improved Text-to-Image models
Safouane El Ghazouali
Arnaud Gucciardi
Umberto Michelucci
EGVM
46
0
0
22 May 2024
Distilling Instruction-following Abilities of Large Language Models with
  Task-aware Curriculum Planning
Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning
Yuanhao Yue
Chengyu Wang
Jun Huang
Peng Wang
ALM
54
9
0
22 May 2024
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via
  Alignment Tax Reduction
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction
Tingchen Fu
Deng Cai
Lemao Liu
Shuming Shi
Rui Yan
MoMe
166
13
0
22 May 2024
360Zhinao Technical Report
360Zhinao Technical Report
360Zhinao Team
67
0
0
22 May 2024
More Distinctively Black and Feminine Faces Lead to Increased
  Stereotyping in Vision-Language Models
More Distinctively Black and Feminine Faces Lead to Increased Stereotyping in Vision-Language Models
Messi H.J. Lee
Jacob M. Montgomery
Calvin K. Lai
VLM
71
0
0
22 May 2024
DEGAP: Dual Event-Guided Adaptive Prefixes for Templated-Based Event Argument Extraction with Slot Querying
DEGAP: Dual Event-Guided Adaptive Prefixes for Templated-Based Event Argument Extraction with Slot Querying
Guanghui Wang
Dexi Liu
Jian-Yun Nie
Qizhi Wan
Rong Hu
Xiping Liu
Wanlong Liu
Jiaming Liu
334
0
0
22 May 2024
Curriculum Direct Preference Optimization for Diffusion and Consistency Models
Curriculum Direct Preference Optimization for Diffusion and Consistency Models
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
N. Sebe
Mubarak Shah
EGVM
203
7
0
22 May 2024
Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations
Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations
Ziqiao Ma
Zekun Wang
Joyce Chai
146
4
0
22 May 2024
How Reliable AI Chatbots are for Disease Prediction from Patient
  Complaints?
How Reliable AI Chatbots are for Disease Prediction from Patient Complaints?
Ayesha Siddika Nipu
Sajjadul Islam
Praveen Madiraju
LM&MAAI4MH
57
4
0
21 May 2024
Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in
  LLMs
Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs
Bilgehan Sel
Priya Shanmugasundaram
Mohammad Kachuee
Kun Zhou
Ruoxi Jia
Ming Jin
LRM
53
3
0
21 May 2024
G-DIG: Towards Gradient-based Diverse and High-quality Instruction Data
  Selection for Machine Translation
G-DIG: Towards Gradient-based Diverse and High-quality Instruction Data Selection for Machine Translation
Xingyuan Pan
Luyang Huang
Liyan Kang
Zhicheng Liu
Yu Lu
Shanbo Cheng
ALM
135
15
0
21 May 2024
An Empirical Study and Analysis of Text-to-Image Generation Using Large
  Language Model-Powered Textual Representation
An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation
Zhiyu Tan
Mengping Yang
Luozheng Qin
Hao Yang
Ye Qian
Qiang-feng Zhou
Cheng Zhang
Hao Li
108
6
0
21 May 2024
Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with
  Minimal Impact on Coherence and Evasiveness in Dialogue Agents
Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents
San Kim
Gary Geunbae Lee
AAML
126
3
0
21 May 2024
Large Language Models Meet NLP: A Survey
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Hai-Tao Zheng
Min Li
Wanxiang Che
Philip S. Yu
ALMLM&MAELMLRM
123
59
0
21 May 2024
SPO: Multi-Dimensional Preference Sequential Alignment With Implicit
  Reward Modeling
SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling
Xingzhou Lou
Junge Zhang
Jian Xie
Lifeng Liu
Dong Yan
Kaiqi Huang
96
13
0
21 May 2024
OLAPH: Improving Factuality in Biomedical Long-form Question Answering
OLAPH: Improving Factuality in Biomedical Long-form Question Answering
Minbyul Jeong
Hyeon Hwang
Chanwoong Yoon
Taewhoo Lee
Jaewoo Kang
MedImHILMLM&MA
123
12
0
21 May 2024
A Survey on Multi-modal Machine Translation: Tasks, Methods and
  Challenges
A Survey on Multi-modal Machine Translation: Tasks, Methods and Challenges
Huangjun Shen
Liangying Shao
Wenbo Li
Zhibin Lan
Zhanyu Liu
Jinsong Su
92
3
0
21 May 2024
Tagengo: A Multilingual Chat Dataset
Tagengo: A Multilingual Chat Dataset
P. Devine
78
3
0
21 May 2024
The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented
  Generation (FutureDial-RAG)
The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG)
Yucheng Cai
Si Chen
Yi Huang
Junlan Feng
Zhijian Ou
105
2
0
21 May 2024
Bridging the Intent Gap: Knowledge-Enhanced Visual Generation
Bridging the Intent Gap: Knowledge-Enhanced Visual Generation
Yi Cheng
Ziwei Xu
Dongyun Lin
Harry Cheng
Yongkang Wong
Ying Sun
Joo Hwee Lim
Mohan Kankanhalli
80
0
0
21 May 2024
Diverse and Effective Synthetic Data Generation for Adaptable Zero-Shot
  Dialogue State Tracking
Diverse and Effective Synthetic Data Generation for Adaptable Zero-Shot Dialogue State Tracking
James D. Finch
Jinho D. Choi
70
0
0
21 May 2024
A Unified Linear Programming Framework for Offline Reward Learning from
  Human Demonstrations and Feedback
A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Kihyun Kim
Jiawei Zhang
Asuman Ozdaglar
P. Parrilo
OffRL
147
2
0
20 May 2024
Diffusion for World Modeling: Visual Details Matter in Atari
Diffusion for World Modeling: Visual Details Matter in Atari
Eloi Alonso
Adam Jelley
Vincent Micheli
Anssi Kanervisto
Amos Storkey
Tim Pearce
Franccois Fleuret
112
69
0
20 May 2024
Fennec: Fine-grained Language Model Evaluation and Correction Extended
  through Branching and Bridging
Fennec: Fine-grained Language Model Evaluation and Correction Extended through Branching and Bridging
Xiaobo Liang
Haoke Zhang
Helan hu
Juntao Li
Jun Xu
Min Zhang
ALM
77
3
0
20 May 2024
Prompt Learning for Generalized Vehicle Routing
Prompt Learning for Generalized Vehicle Routing
Fei Liu
Xi Lin
Weiduo Liao
Zhenkun Wang
Qingfu Zhang
Xialiang Tong
Mingxuan Yuan
VLM
86
3
0
20 May 2024
Imp: Highly Capable Large Multimodal Models for Mobile Devices
Imp: Highly Capable Large Multimodal Models for Mobile Devices
Zhenwei Shao
Zhou Yu
Jun Yu
Xuecheng Ouyang
Lihao Zheng
Zhenbiao Gai
Mingyang Wang
Jiajun Ding
67
11
0
20 May 2024
CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information
  Needs in Large Language Models
CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models
Tong Zhang
Peixin Qin
Yang Deng
Chen Huang
Wenqiang Lei
Junhong Liu
Dingnan Jin
Hongru Liang
Tat-Seng Chua
71
13
0
20 May 2024
Data Contamination Calibration for Black-box LLMs
Data Contamination Calibration for Black-box LLMs
Wen-song Ye
Jiaqi Hu
Liyao Li
Haobo Wang
Gang Chen
Junbo Zhao
64
9
0
20 May 2024
ARAIDA: Analogical Reasoning-Augmented Interactive Data Annotation
ARAIDA: Analogical Reasoning-Augmented Interactive Data Annotation
Chen Huang
Yiping Jin
Ilija Ilievski
Wenqiang Lei
Jiancheng Lv
87
3
0
20 May 2024
A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI:
  The First Romanian Natural Language Inference Corpus
A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus
Eduard Poesina
Cornelia Caragea
Radu Tudor Ionescu
80
6
0
20 May 2024
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single
  Process
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process
Ermo Hua
Biqing Qi
Kaiyan Zhang
Yue Yu
Ning Ding
Xingtai Lv
Kai Tian
Bowen Zhou
78
5
0
20 May 2024
TinyLLaVA Factory: A Modularized Codebase for Small-scale Large
  Multimodal Models
TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models
Junlong Jia
Ying Hu
Xi Weng
Yiming Shi
Miao Li
...
Baichuan Zhou
Ziyu Liu
Jie Luo
Lei Huang
Ji Wu
97
9
0
20 May 2024
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Jian Hu
Xibin Wu
Wei Shen
OpenLLMAI Team
Dehao Zhang
...
Weikai Fang
Xianyu
Yu Cao
Haotian Xu
Yiming Liu
VLMAI4CE
136
130
0
20 May 2024
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
Minghao Wu
Jiahao Xu
Yulin Yuan
Gholamreza Haffari
Longyue Wang
Weihua Luo
Kaifu Zhang
LLMAG
184
27
0
20 May 2024
Your Transformer is Secretly Linear
Your Transformer is Secretly Linear
Anton Razzhigaev
Matvey Mikhalchuk
Elizaveta Goncharova
Nikolai Gerasimenko
Ivan Oseledets
Denis Dimitrov
Andrey Kuznetsov
81
6
0
19 May 2024
Hummer: Towards Limited Competitive Preference Dataset
Hummer: Towards Limited Competitive Preference Dataset
Li Jiang
Yusen Wu
Junwu Xiong
Jingqing Ruan
Yichuan Ding
Qingpei Guo
ZuJie Wen
Jun Zhou
Xiaotie Deng
91
7
0
19 May 2024
Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive
  Learning for Text Classification
Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification
Mengyu Li
Yuqi Liu
Fausto Giunchiglia
Xiaoyue Feng
Renchu Guan
VLM
114
5
0
19 May 2024
SemEval-2024 Task 3: Multimodal Emotion Cause Analysis in Conversations
SemEval-2024 Task 3: Multimodal Emotion Cause Analysis in Conversations
Fanfan Wang
Heqing Ma
Jianfei Yu
Rui Xia
Erik Cambria
103
24
0
19 May 2024
Comparisons Are All You Need for Optimizing Smooth Functions
Comparisons Are All You Need for Optimizing Smooth Functions
Chenyi Zhang
Tongyang Li
AAML
110
2
0
19 May 2024
MHPP: Exploring the Capabilities and Limitations of Language Models
  Beyond Basic Code Generation
MHPP: Exploring the Capabilities and Limitations of Language Models Beyond Basic Code Generation
Jianbo Dai
Jianqiao Lu
Yunlong Feng
Rongju Ruan
Ming Cheng
Haochen Tan
Zhijiang Guo
ELMLRM
113
16
0
19 May 2024
Previous
123...757677...126127128
Next