2203.02155
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing "Training language models to follow instructions with human feedback"
50 / 6,390 papers shown
Studying LLM Performance on Closed- and Open-source Data
Toufique Ahmed
Christian Bird
Prem Devanbu
Saikat Chakraborty
96
9
0
23 Feb 2024
KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
Zhuohao Yu
Chang Gao
Wenjin Yao
Yidong Wang
Wei Ye
Jindong Wang
Xing Xie
Yue Zhang
Shikun Zhang
90
28
0
23 Feb 2024
Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models
Guanming Xiong
Junwei Bao
Wen Zhao
KELM
151
13
0
23 Feb 2024
Evaluating the Performance of ChatGPT for Spam Email Detection
Shijing Si
Yuwei Wu
Jiawen Gu
Yugui Zhang
Jedrek Wosik
Qinliang Su
136
9
0
23 Feb 2024
Unintended Impacts of LLM Alignment on Global Representation
Michael Joseph Ryan
William B. Held
Diyi Yang
116
42
0
22 Feb 2024
Divide-or-Conquer? Which Part Should You Distill Your LLM?
Zhuofeng Wu
Richard He Bai
Aonan Zhang
Jiatao Gu
V. Vydiswaran
Navdeep Jaitly
Yizhe Zhang
LRM
111
12
0
22 Feb 2024
Optimizing Language Models for Human Preferences is a Causal Inference Problem
Victoria Lin
Eli Ben-Michael
Louis-Philippe Morency
105
5
0
22 Feb 2024
Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
Jiong Wang
Jiazhao Li
Yiquan Li
Xiangyu Qi
Junjie Hu
Yixuan Li
P. McDaniel
Muhao Chen
Bo Li
Chaowei Xiao
AAML
SILM
113
22
0
22 Feb 2024
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking
Nikhil Prakash
Tamar Rott Shaham
Tal Haklay
Yonatan Belinkov
David Bau
99
67
0
22 Feb 2024
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
Zicheng Lin
Zhibin Gou
Tian Liang
Ruilin Luo
Haowei Liu
Yujiu Yang
LRM
109
56
0
22 Feb 2024
Identifying Multiple Personalities in Large Language Models with External Evaluation
Xiaoyang Song
Yuta Adachi
Jessie Feng
Mouwei Lin
Linhao Yu
Frank Li
Akshat Gupta
Gopala Anumanchipalli
Simerjot Kaur
LLMAG
88
8
0
22 Feb 2024
Watermarking Makes Language Models Radioactive
Tom Sander
Pierre Fernandez
Alain Durmus
Matthijs Douze
Teddy Furon
WaLM
82
20
0
22 Feb 2024
Zero-shot cross-lingual transfer in instruction tuning of large language models
Nadezhda Chirkova
Vassilina Nikoulina
LRM
82
4
0
22 Feb 2024
DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models
Yuhang Cao
Pan Zhang
Xiao-wen Dong
Dahua Lin
Jiaqi Wang
82
12
0
22 Feb 2024
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
Ge Bai
Jie Liu
Xingyuan Bu
Yancheng He
Jiaheng Liu
...
Zhuoran Lin
Wenbo Su
Tiezheng Ge
Bo Zheng
Wanli Ouyang
ELM
LM&MA
125
94
0
22 Feb 2024
Generalizing Reward Modeling for Out-of-Distribution Preference Learning
Chen Jia
83
2
0
22 Feb 2024
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus
Honghao Gui
Lin Yuan
Hongbin Ye
Ningyu Zhang
Mengshu Sun
Lei Liang
Huajun Chen
91
11
0
22 Feb 2024
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
Kenneth Li
Samy Jelassi
Hugh Zhang
Sham Kakade
Martin Wattenberg
David Brandfonbrener
138
11
0
22 Feb 2024
Whose LLM is it Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard
Ariel Rosenfeld
T. Lazebnik
DeLMO
84
9
0
22 Feb 2024
Should We Respect LLMs? A Cross-Lingual Study on the Influence of Prompt Politeness on LLM Performance
Ziqi Yin
Hao Wang
Kaito Horio
Daisuke Kawahara
Satoshi Sekine
116
29
0
22 Feb 2024
"My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models
Xinpeng Wang
Bolei Ma
Chengzhi Hu
Leon Weber-Genzel
Paul Röttger
Frauke Kreuter
Dirk Hovy
Barbara Plank
83
46
0
22 Feb 2024
Noise-BERT: A Unified Perturbation-Robust Framework with Noise Alignment Pre-training for Noisy Slot Filling Task
Jinxu Zhao
Guanting Dong
Yueyan Qiu
Tingfeng Hui
Xiaoshuai Song
Daichi Guo
Weiran Xu
66
2
0
22 Feb 2024
Towards Robust Instruction Tuning on Multimodal Large Language Models
Wei Han
Hui Chen
Soujanya Poria
MLLM
80
1
0
22 Feb 2024
Is ChatGPT the Future of Causal Text Mining? A Comprehensive Evaluation and Analysis
Takehiro Takayanagi
Masahiro Suzuki
Ryotaro Kobayashi
Hiroki Sakaji
Kiyoshi Izumi
87
1
0
22 Feb 2024
Do LLMs Implicitly Determine the Suitable Text Difficulty for Users?
Seiji Gobara
Hidetaka Kamigaito
Taro Watanabe
79
4
0
22 Feb 2024
INSTRUCTIR: A Benchmark for Instruction Following of Information Retrieval Models
Hanseok Oh
Hyunji Lee
Seonghyeon Ye
Haebin Shin
Hansol Jang
Changwook Jun
Minjoon Seo
136
22
0
22 Feb 2024
CEV-LM: Controlled Edit Vector Language Model for Shaping Natural Language Generations
Samraj Moorjani
A. Krishnan
Hari Sundaram
KELM
74
1
0
22 Feb 2024
Double-I Watermark: Protecting Model Copyright for LLM Fine-tuning
Shen Li
Liuyi Yao
Jinyang Gao
Lan Zhang
Yaliang Li
123
13
0
22 Feb 2024
Qsnail: A Questionnaire Dataset for Sequential Question Generation
Yan Lei
Liang Pang
Yuanzhuo Wang
Huawei Shen
Xueqi Cheng
60
0
0
22 Feb 2024
Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming
Anisha Agarwal
Aaron Chan
Shubham Chandel
Jinu Jang
Shaun Miller
Roshanak Zilouchian Moghaddam
Yevhen Mohylevskyy
Neel Sundaresan
Michele Tufano
ELM
61
17
0
22 Feb 2024
Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond
Zhiyuan Wang
Jinhao Duan
Chenxi Yuan
Qingyu Chen
Tianlong Chen
Huaxiu Yao
Yue Zhang
Ren Wang
Kaidi Xu
Xiaoshuang Shi
UQLM
181
13
0
22 Feb 2024
Eagle: Ethical Dataset Given from Real Interactions
Masahiro Kaneko
Danushka Bollegala
Timothy Baldwin
75
4
0
22 Feb 2024
MENTOR: Guiding Hierarchical Reinforcement Learning with Human Feedback and Dynamic Distance Constraint
Xinglin Zhou
Yifu Yuan
Shaofu Yang
Jianye Hao
77
2
0
22 Feb 2024
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
Yijia Shao
Yucheng Jiang
Theodore A. Kanell
Peter Xu
Omar Khattab
Monica S. Lam
LLMAG
KELM
120
51
0
22 Feb 2024
From Adoption to Adaption: Tracing the Diffusion of New Emojis on Twitter
Yuhang Zhou
Xuan Lu
Wei Ai
90
2
0
22 Feb 2024
LLMs with Industrial Lens: Deciphering the Challenges and Prospects -- A Survey
Ashok Urlana
Charaka Vinayak Kumar
Ajeet Kumar Singh
B. Garlapati
S. Chalamala
Rahul Mishra
124
8
0
22 Feb 2024
EXACT-Net: EHR-guided lung tumor auto-segmentation for non-small cell lung cancer radiotherapy
H. Hooshangnejad
Xue Feng
Gaofeng Huang
Rui Zhang
Quan Chen
Kai Ding
53
5
0
21 Feb 2024
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
Lucas Lehnert
Sainbayar Sukhbaatar
DiJia Su
Qinqing Zheng
Paul Mcvay
Michael Rabbat
Yuandong Tian
121
65
0
21 Feb 2024
Coercing LLMs to do and reveal (almost) anything
Jonas Geiping
Alex Stein
Manli Shu
Khalid Saifullah
Yuxin Wen
Tom Goldstein
AAML
85
55
0
21 Feb 2024
Exploring ChatGPT and its Impact on Society
Md. Asraful Haque
Shuai Li
SILM
112
29
0
21 Feb 2024
SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization
Prakamya Mishra
Zonghai Yao
Parth Vashisht
Feiyun Ouyang
Beining Wang
Vidhi Mody
Hong-ye Yu
SyDa
MedIm
87
5
0
21 Feb 2024
Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models
Chenyang Lyu
Minghao Wu
Alham Fikri Aji
ELM
66
14
0
21 Feb 2024
Semantic Mirror Jailbreak: Genetic Algorithm Based Jailbreak Prompts Against Open-source LLMs
Xiaoxia Li
Siyuan Liang
Jiyi Zhang
Hansheng Fang
Aishan Liu
Ee-Chien Chang
153
28
0
21 Feb 2024
Music Style Transfer with Time-Varying Inversion of Diffusion Models
Sifei Li
Yuxin Zhang
Fan Tang
Chongyang Ma
Weiming Dong
Changsheng Xu
DiffM
72
11
0
21 Feb 2024
An Evaluation of Large Language Models in Bioinformatics Research
Hengchuang Yin
Zhonghui Gu
Fanhao Wang
Yiparemu Abuduhaibaier
Yanqiao Zhu
Xinming Tu
Xian-Sheng Hua
Xiao Luo
Yizhou Sun
LM&MA
80
8
0
21 Feb 2024
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning
Zhaorui Yang
Tianyu Pang
Hao Feng
Han Wang
Wei Chen
Minfeng Zhu
Qian Liu
ALM
99
50
0
21 Feb 2024
Privacy-Preserving Instructions for Aligning Large Language Models
Da Yu
Peter Kairouz
Sewoong Oh
Zheng Xu
120
25
0
21 Feb 2024
KorNAT: LLM Alignment Benchmark for Korean Social Values and Common Knowledge
Jiyoung Lee
Minwoo Kim
Seungho Kim
Junghwan Kim
Seunghyun Won
Hwaran Lee
Edward Choi
ALM
127
17
0
21 Feb 2024
APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models
Ziyi Guan
Hantao Huang
Yupeng Su
Hong Huang
Ngai Wong
Hao Yu
MQ
89
16
0
21 Feb 2024
WinoViz: Probing Visual Properties of Objects Under Different States
Woojeong Jin
Tejas Srinivasan
Jesse Thomason
Xiang Ren
87
1
0
21 Feb 2024