2203.02155
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing "Training language models to follow instructions with human feedback"
50 / 6,372 papers shown
Title
RLHF Workflow: From Reward Modeling to Online RLHF
Hanze Dong
Wei Xiong
Bo Pang
Haoxiang Wang
Han Zhao
Yingbo Zhou
Nan Jiang
Doyen Sahoo
Caiming Xiong
Tong Zhang
OffRL
91
132
0
13 May 2024
FreeVA: Offline MLLM as Training-Free Video Assistant
Wenhao Wu
VLM
OffRL
87
20
0
13 May 2024
LlamaTurk: Adapting Open-Source Generative Large Language Models for Low-Resource Language
Cagri Toraman
VLM
116
5
0
13 May 2024
Quantifying and Optimizing Global Faithfulness in Persona-driven Role-playing
Letian Peng
Jingbo Shang
102
3
0
13 May 2024
UCCIX: Irish-eXcellence Large Language Model
Khanh-Tung Tran
Barry O'Sullivan
Hoang D. Nguyen
61
6
0
13 May 2024
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf B. Cassel
Haipeng Luo
Aviv A. Rosenberg
Dmitry Sotnikov
OffRL
81
4
0
13 May 2024
Evaluation of Retrieval-Augmented Generation: A Survey
Hao Yu
Aoran Gan
Kai Zhang
Shiwei Tong
Qi Liu
Zhaofeng Liu
3DV
140
100
0
13 May 2024
Can Language Models Explain Their Own Classification Behavior?
Dane Sherburn
Bilal Chughtai
Owain Evans
64
1
0
13 May 2024
DEPTH: Discourse Education through Pre-Training Hierarchically
Zachary Bamberger
Ofek Glick
Chaim Baskin
Yonatan Belinkov
126
0
0
13 May 2024
Separable Power of Classical and Quantum Learning Protocols Through the Lens of No-Free-Lunch Theorem
Xinbiao Wang
Yuxuan Du
Kecheng Liu
Yong Luo
Bo Du
Dacheng Tao
64
1
0
12 May 2024
Large Language Models for Education: A Survey
Hanyi Xu
Wensheng Gan
Zhenlian Qi
Jiayang Wu
Philip S. Yu
AI4Ed
ELM
148
18
0
12 May 2024
Integrating Emotional and Linguistic Models for Ethical Compliance in Large Language Models
Edward Y. Chang
40
3
0
11 May 2024
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
80
2
0
11 May 2024
Automating Creativity
Ming-Hui Huang
R. Rust
105
0
0
11 May 2024
AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents
Shuyuan Xu
Zelong Li
Kai Mei
Yongfeng Zhang
75
5
0
11 May 2024
Scalable and Effective Arithmetic Tree Generation for Adder and Multiplier Designs
Yao Lai
Jinxin Liu
David Z. Pan
Ping Luo
119
5
0
10 May 2024
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation
JoonHo Lee
Jae Oh Woo
Juree Seok
Parisa Hassanzadeh
Wooseok Jang
...
Hankyu Moon
Wenjun Hu
Yeong-Dae Kwon
Taehee Lee
Seungjai Min
144
2
0
10 May 2024
Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL
Ning Cheng
Zhaohui Yan
Ziming Wang
Zhijie Li
Jiaming Yu
Zilong Zheng
Kewei Tu
Jinan Xu
Wenjuan Han
56
6
0
10 May 2024
LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play
Li-Chun Lu
Shou-Jen Chen
Tsung-Min Pai
Chan-Hung Yu
Hung-yi Lee
Shao-Hua Sun
LLMAG
98
50
0
10 May 2024
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Zorik Gekhman
G. Yona
Roee Aharoni
Matan Eyal
Amir Feder
Roi Reichart
Jonathan Herzig
148
137
0
09 May 2024
Efficient LLM Comparative Assessment: a Product of Experts Framework for Pairwise Comparisons
Adian Liusie
Vatsal Raina
Yassir Fathullah
Mark Gales
104
12
0
09 May 2024
Can large language models understand uncommon meanings of common words?
Jinyang Wu
Feihu Che
Xinxin Zheng
Shuai Zhang
Ruihan Jin
Shuai Nie
Pengpeng Shao
Jianhua Tao
80
4
0
09 May 2024
Evaluating Dialect Robustness of Language Models via Conversation Understanding
Dipankar Srirag
Aditya Joshi
102
3
0
09 May 2024
Redefining Information Retrieval of Structured Database via Large Language Models
Mingzhu Wang
Yuzhe Zhang
Qihang Zhao
Juanyi Yang
Kuanqi Cai
RALM
KELM
91
0
0
09 May 2024
Using Machine Translation to Augment Multilingual Classification
Adam King
85
0
0
09 May 2024
Mitigating Exaggerated Safety in Large Language Models
Ruchi Bhalani
Ruchira Ray
64
2
0
08 May 2024
The Effect of Model Size on LLM Post-hoc Explainability via LIME
Henning Heyen
Amy Widdicombe
Noah Y. Siegel
Maria Perez-Ortiz
Philip C. Treleaven
LRM
89
1
0
08 May 2024
Benchmarking Educational Program Repair
Charles Koutcheme
Nicola Dainese
Sami Sarsa
Juho Leinonen
Arto Hellas
Paul Denny
AI4Ed
86
5
0
08 May 2024
KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation
Minsik Cho
Mohammad Rastegari
Devang Naik
80
4
0
08 May 2024
QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs
Weijia Zhang
Vaishali Pal
Jia-Hong Huang
Evangelos Kanoulas
Maarten de Rijke
LMTD
109
8
0
08 May 2024
ADELIE: Aligning Large Language Models on Information Extraction
Yunjia Qi
Hao Peng
Xiaozhi Wang
Bin Xu
Lei Hou
Juanzi Li
102
11
0
08 May 2024
Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models
Zhengxing Lan
Hongbo Li
Lingshan Liu
Bo Fan
Yisheng Lv
Yilong Ren
Zhiyong Cui
95
23
0
08 May 2024
CourseGPT-zh: an Educational Large Language Model Based on Knowledge Distillation Incorporating Prompt Optimization
Zheyan Qu
Lu Yin
Zitong Yu
Wenbo Wang
Xing Zhang
ALM
56
2
0
08 May 2024
Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu
Shenao Wang
Ningke Li
Kaidi Wang
Yanjie Zhao
Kai Chen
Ting Yu
Yang Liu
Haoyu Wang
119
43
0
08 May 2024
Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking
Emre Can Acikgoz
Mete Erdogan
Deniz Yuret
86
8
0
07 May 2024
ACEGEN: Reinforcement learning of generative chemical agents for drug discovery
Albert Bou
Morgan Thomas
Sebastian Dittert
Carles Navarro Ramírez
Maciej Majewski
...
Mazen Ahmad
Vincent Moens
Woody Sherman
Simone Sciabola
Gianni De Fabritiis
98
9
0
07 May 2024
NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts
Shudan Zhang
Hanlin Zhao
Xiao Liu
Qinkai Zheng
Zehan Qi
Xiaotao Gu
Xiaohan Zhang
Yuxiao Dong
Jie Tang
ELM
114
17
0
07 May 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek-AI
Aixin Liu
Bei Feng
Bin Wang
Bingxuan Wang
...
Zhuoshu Li
Zihan Wang
Zihui Gu
Zilin Li
Ziwei Xie
MoE
170
500
0
07 May 2024
Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks
Georgios Pantazopoulos
Amit Parekh
Malvina Nikandrou
Alessandro Suglia
113
5
0
07 May 2024
Iterative Experience Refinement of Software-Developing Agents
Cheng Qian
Jiahao Li
Yufan Dang
Wei Liu
Yifei Wang
...
Weize Chen
Cheng Yang
Yingli Zhang
Zhiyuan Liu
Maosong Sun
LLMAG
71
13
0
07 May 2024
A Causal Explainable Guardrails for Large Language Models
Zhixuan Chu
Yan Wang
Longfei Li
Peng Kuang
Zhan Qin
Kui Ren
LLMSV
97
9
0
07 May 2024
ESP: Extro-Spective Prediction for Long-term Behavior Reasoning in Emergency Scenarios
Dingrui Wang
Zheyuan Lai
Yuda Li
Yi Wu
Yuexin Ma
Johannes Betz
Ruigang Yang
Wei Li
47
1
0
07 May 2024
Optimizing Language Model's Reasoning Abilities with Weak Supervision
Yongqi Tong
Sizhe Wang
Dawei Li
Yifan Wang
Simeng Han
Zi Lin
Chengsong Huang
Jiaxin Huang
Jingbo Shang
LRM
ReLM
99
10
0
07 May 2024
POV Learning: Individual Alignment of Multimodal Models using Human Perception
Simon Werner
Katharina Christ
Laura Bernardy
Marion G. Müller
Achim Rettinger
33
0
0
07 May 2024
pFedLVM: A Large Vision Model (LVM)-Driven and Latent Feature-Based Personalized Federated Learning Framework in Autonomous Driving
Wei-Bin Kou
Qingfeng Lin
Ming Tang
Sheng Xu
Rongguang Ye
...
Shuai Wang
Guofa Li
Zhenyu Chen
Guangxu Zhu
Yik-Chung Wu
FedML
121
13
0
07 May 2024
In Situ AI Prototyping: Infusing Multimodal Prompts into Mobile Settings with MobileMaker
S. Petridis
Michael Xieyang Liu
Alexander J. Fiannaca
Vivian Tsai
Michael Terry
Carrie J. Cai
90
0
0
06 May 2024
MoDiPO: text-to-motion alignment via AI-feedback-driven Direct Preference Optimization
Massimiliano Pappa
Luca Collorone
Giovanni Ficarra
Indro Spinelli
Yuta Kyuragi
66
2
0
06 May 2024
Can LLMs Deeply Detect Complex Malicious Queries? A Framework for Jailbreaking via Obfuscating Intent
Shang Shang
Xinqiang Zhao
Zhongjiang Yao
Yepeng Yao
Liya Su
Zijing Fan
Xiaodan Zhang
Zhengwei Jiang
113
6
0
06 May 2024
Generated Contents Enrichment
Mahdi Naseri
Jiayan Qiu
Zhou Wang
109
0
0
06 May 2024
When LLMs Meet Cybersecurity: A Systematic Literature Review
Jie Zhang
Haoyu Bu
Hui Wen
Yu Chen
Lun Li
Hongsong Zhu
143
47
0
06 May 2024