2203.02155
Cited By
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing "Training language models to follow instructions with human feedback"
50 / 6,384 papers shown
Title
CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion
Qibing Ren
Chang Gao
Jing Shao
Junchi Yan
Xin Tan
Wai Lam
Lizhuang Ma
ALM
ELM
AAML
120
26
0
12 Mar 2024
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Sainbayar Sukhbaatar
O. Yu. Golovneva
Vasu Sharma
Hu Xu
Xi Lin
...
Jacob Kahn
Shang-Wen Li
Wen-tau Yih
Jason Weston
Xian Li
MoMe
OffRL
MoE
98
69
0
12 Mar 2024
Beyond Memorization: The Challenge of Random Memory Access in Language Models
Tongyao Zhu
Qian Liu
Liang Pang
Zhengbao Jiang
Min-Yen Kan
Min Lin
KELM
91
6
0
12 Mar 2024
Fine-tuning Large Language Models with Sequential Instructions
Hanxu Hu
Simon Yu
Pinzhen Chen
Edoardo Ponti
ALM
LRM
137
15
0
12 Mar 2024
FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models
Yan Liu
Renren Jin
Ling Shi
Zheng Yao
Deyi Xiong
LRM
74
5
0
12 Mar 2024
KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction
Zixuan Li
Yutao Zeng
Yuxin Zuo
Weicheng Ren
Wenxuan Liu
...
Yidan Liu
Pan Yang
Xiaolong Jin
Jiafeng Guo
Xueqi Cheng
OffRL
101
34
0
12 Mar 2024
ORPO: Monolithic Preference Optimization without Reference Model
Jiwoo Hong
Noah Lee
James Thorne
OSLM
113
268
0
12 Mar 2024
Characterization of Large Language Model Development in the Datacenter
Qi Hu
Zhisheng Ye
Zerui Wang
Guoteng Wang
Mengdie Zhang
...
Dahua Lin
Xiaolin Wang
Yingwei Luo
Yonggang Wen
Tianwei Zhang
94
51
0
12 Mar 2024
generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation
Thilo Spinner
Rebecca Kehlbeck
Rita Sevastjanova
Tobias Stähle
Daniel A. Keim
Oliver Deussen
Mennatallah El-Assady
74
3
0
12 Mar 2024
RSBuilding: Towards General Remote Sensing Image Building Extraction and Change Detection with Foundation Model
Mingze Wang
Lili Su
Cilin Yan
Sheng Xu
Pengcheng Yuan
Xiaolong Jiang
Baochang Zhang
72
12
0
12 Mar 2024
SIFiD: Reassess Summary Factual Inconsistency Detection with LLM
Jiuding Yang
Hui Liu
Weidong Guo
Zhuwei Rao
Yu-Syuan Xu
Di Niu
HILM
84
0
0
12 Mar 2024
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Byung-Kwan Lee
Beomchan Park
Chae Won Kim
Yonghyun Ro
MLLM
VLM
138
23
0
12 Mar 2024
Matrix-Transformation Based Low-Rank Adaptation (MTLoRA): A Brain-Inspired Method for Parameter-Efficient Fine-Tuning
Yao Liang
Yuwei Wang
Yang Li
Yi Zeng
100
2
0
12 Mar 2024
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models
Yu Yang
Siddhartha Mishra
Jeffrey N Chiang
Baharan Mirzasoleiman
101
24
0
12 Mar 2024
Towards Faithful Explanations: Boosting Rationalization with Shortcuts Discovery
Linan Yue
Qi Liu
Yichao Du
Li Wang
Weibo Gao
Yanqing An
82
5
0
12 Mar 2024
Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models
Yang Jiao
Shaoxiang Chen
Zequn Jie
Wenke Huang
Lin Ma
Yueping Jiang
MLLM
88
20
0
12 Mar 2024
Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences
Pulkit Pattnaik
Rishabh Maheshwary
Kelechi Ogueji
Vikas Yadav
Sathwik Tejaswi Madhusudhan
75
22
0
12 Mar 2024
(N,K)-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model
Yufeng Zhang
Liyu Chen
Boyi Liu
Yingxiang Yang
Qiwen Cui
Yunzhe Tao
Hongxia Yang
227
0
0
11 Mar 2024
The pitfalls of next-token prediction
Gregor Bachmann
Vaishnavh Nagarajan
115
81
0
11 Mar 2024
Materials science in the era of large language models: a perspective
Ge Lei
Ronan Docherty
Samuel J. Cooper
86
18
0
11 Mar 2024
RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback
Yanming Liu
Xinyue Peng
Xuhong Zhang
Weihao Liu
Jianwei Yin
Jiannan Cao
Tianyu Du
RALM
71
45
0
11 Mar 2024
ALaRM: Align Language Models via Hierarchical Rewards Modeling
Yuhang Lai
Siyuan Wang
Shujun Liu
Xuanjing Huang
Zhongyu Wei
89
5
0
11 Mar 2024
Large Model driven Radiology Report Generation with Clinical Quality Reinforcement Learning
Zijian Zhou
Miaojing Shi
Meng Wei
Oluwatosin O. Alabi
Zijie Yue
Tom Vercauteren
LM&MA
84
7
0
11 Mar 2024
Elephants Never Forget: Testing Language Models for Memorization of Tabular Data
Sebastian Bordt
Harsha Nori
Rich Caruana
LMTD
100
19
0
11 Mar 2024
RLingua: Improving Reinforcement Learning Sample Efficiency in Robotic Manipulations With Large Language Models
Liangliang Chen
Yutian Lei
Shiyu Jin
Ying Zhang
Liangjun Zhang
LM&Ro
109
12
0
11 Mar 2024
A Logical Pattern Memory Pre-trained Model for Entailment Tree Generation
Li Yuan
Yi Cai
Haopeng Ren
Jiexin Wang
LRM
69
5
0
11 Mar 2024
A Knowledge-Injected Curriculum Pretraining Framework for Question Answering
Xin Lin
Tianhuang Su
Zhenya Huang
Shangzi Xue
Haifeng Liu
Enhong Chen
69
2
0
11 Mar 2024
Amharic LLaMA and LLaVA: Multimodal LLMs for Low Resource Languages
Michael Andersland
33
0
0
11 Mar 2024
From Instructions to Constraints: Language Model Alignment with Automatic Constraint Verification
Fei Wang
Chao Shang
Sarthak Jain
Shuai Wang
Qiang Ning
Bonan Min
Vittorio Castelli
Yassine Benajiba
Dan Roth
ALM
55
8
0
10 Mar 2024
FARPLS: A Feature-Augmented Robot Trajectory Preference Labeling System to Assist Human Labelers' Preference Elicitation
Hanfang Lyu
Yuanchen Bai
Xin Liang
Ujaan Das
Chuhan Shi
Leiliang Gong
Yingchi Li
Mingfei Sun
Ming Ge
Xiaojuan Ma
82
0
0
10 Mar 2024
TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision
Ruiwen Zhou
Yingxuan Yang
Kangrui Chen
Ying Wen
Wenhao Wang
Chunling Xi
Guoqiang Xu
Jiliang Tang
Lingjuan Lyu
LLMAG
43
11
0
10 Mar 2024
Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models
Minjie Zhu
Yichen Zhu
Xin Liu
Ning Liu
Zhiyuan Xu
Yaxin Peng
Chaomin Shen
Zhicai Ou
Feifei Feng
Jian Tang
VLM
102
22
0
10 Mar 2024
Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations
Swapnaja Achintalwar
Adriana Alvarado Garcia
Ateret Anaby-Tavor
Ioana Baldini
Sara E. Berger
...
Aashka Trivedi
Kush R. Varshney
Dennis L. Wei
Shalisha Witherspoon
Marcel Zalmanovici
94
11
0
09 Mar 2024
A Generalized Acquisition Function for Preference-based Reward Learning
Evan Ellis
Gaurav R. Ghosal
Stuart J. Russell
Anca Dragan
Erdem Biyik
65
2
0
09 Mar 2024
Reverse That Number! Decoding Order Matters in Arithmetic Learning
Daniel Zhang-Li
Nianyi Lin
Jifan Yu
Zheyuan Zhang
Zijun Yao
Yanling Wang
Lei Hou
Jing Zhang
Juanzi Li
75
4
0
09 Mar 2024
S²IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting
Zijie Pan
Yushan Jiang
Sahil Garg
Anderson Schneider
Yuriy Nevmyvaka
Dongjin Song
AI4TS
148
8
0
09 Mar 2024
Are Large Language Models Aligned with People's Social Intuitions for Human-Robot Interactions?
Lennart Wachowiak
Andrew Coles
Oya Celiktutan
Gerard Canal
79
0
0
08 Mar 2024
Concept-aware Data Construction Improves In-context Learning of Language Models
Michal Štefánik
Marek Kadlcík
Petr Sojka
94
1
0
08 Mar 2024
Bayesian Preference Elicitation with Language Models
Kunal Handa
Yarin Gal
Ellie Pavlick
Noah D. Goodman
Jacob Andreas
Alex Tamkin
Belinda Z. Li
77
16
0
08 Mar 2024
Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapolation
Yijiang Li
Sucheng Ren
Weipeng Deng
Yuzhi Xu
Ying Gao
Edith C.H. Ngai
Haohan Wang
OOD
99
2
0
08 Mar 2024
Unfamiliar Finetuning Examples Control How Language Models Hallucinate
Katie Kang
Eric Wallace
Claire Tomlin
Aviral Kumar
Sergey Levine
HILM
LRM
109
58
0
08 Mar 2024
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
Zihao Wang
Hoang Trung-Dung
Haowei Lin
Jiaqi Li
Xiaojian Ma
Yitao Liang
ReLM
RALM
LRM
163
49
0
08 Mar 2024
Harnessing Multi-Role Capabilities of Large Language Models for Open-Domain Question Answering
Hongda Sun
Yuxuan Liu
Chengwei Wu
Haiyu Yan
Cheng Tai
Xin Gao
Shuo Shang
Rui Yan
92
11
0
08 Mar 2024
Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation
Xiaoying Zhang
Jean-François Ton
Wei Shen
Hongning Wang
Yang Liu
78
15
0
08 Mar 2024
On Protecting the Data Privacy of Large Language Models (LLMs): A Survey
Biwei Yan
Kun Li
Minghui Xu
Yueyan Dong
Yue Zhang
Zhaochun Ren
Xiuzhen Cheng
AILaw
PILM
154
88
0
08 Mar 2024
Evaluating Text-to-Image Generative Models: An Empirical Study on Human Image Synthesis
Mu-Hwa Chen
Yi Liu
Jian Yi
Changran Xu
Qiuxia Lai
Hongliang Wang
Tsung-Yi Ho
Qiang Xu
EGVM
84
10
0
08 Mar 2024
Benchmarking Large Language Models for Molecule Prediction Tasks
Zhiqiang Zhong
Kuangyu Zhou
Davide Mottin
76
10
0
08 Mar 2024
Aligning Large Language Models for Controllable Recommendations
Wensheng Lu
Jianxun Lian
Wei Zhang
Guanghua Li
Mingyang Zhou
Hao Liao
Xing Xie
ALM
97
16
0
08 Mar 2024
Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs
Xuhui Zhou
Zhe Su
Tiwalayo Eisape
Hyunwoo J. Kim
Maarten Sap
80
41
0
08 Mar 2024
Provable Multi-Party Reinforcement Learning with Diverse Human Feedback
Huiying Zhong
Zhun Deng
Weijie J. Su
Zhiwei Steven Wu
Linjun Zhang
79
18
0
08 Mar 2024