arXiv:2203.02155
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing
"Training language models to follow instructions with human feedback"
50 / 6,392 papers shown
Title
Coalitions of Large Language Models Increase the Robustness of AI Agents
Prattyush Mangal
Carol Mak
Theo Kanakis
Timothy Donovan
Dave Braines
Edward Pyzer-Knapp
58
1
0
02 Aug 2024
Prompt Refinement or Fine-tuning? Best Practices for using LLMs in Computational Social Science Tasks
Anders Giovanni Moller
L. Aiello
LLMAG
53
4
0
02 Aug 2024
FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only
He Zhu
Junyou Su
Tianle Lun
Yicheng Tao
Wenjia Zhang
Zipei Fan
Guanhua Chen
ALM
86
5
0
02 Aug 2024
TCR-GPT: Integrating Autoregressive Model and Reinforcement Learning for T-Cell Receptor Repertoires Generation
Yicheng Lin
David Soto
Roberto Santana
60
1
0
02 Aug 2024
Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions
Jin Gao
Lei Gan
Yuankai Li
Yixin Ye
Dequan Wang
73
3
0
02 Aug 2024
Adaptive Contrastive Decoding in Retrieval-Augmented Generation for Handling Noisy Contexts
Youna Kim
Sungmin Cho
Cheonbok Park
Choonghyun Park
Hyunsoo Cho
Junyeob Kim
Kang Min Yoo
Sang-goo Lee
Taeuk Kim
81
7
0
02 Aug 2024
LLM as Runtime Error Handler: A Promising Pathway to Adaptive Self-Healing of Software Systems
Zhensu Sun
Haotian Zhu
Bowen Xu
Xiaoning Du
Yizhe Zhu
David Lo
81
4
0
02 Aug 2024
A Safe Exploration Strategy for Model-free Task Adaptation in Safety-constrained Grid Environments
Erfan Entezami
Mahsa Sahebdel
Dhawal Gupta
88
0
0
02 Aug 2024
A Survey on Self-play Methods in Reinforcement Learning
Chao Yu
Zelai Xu
Chengdong Ma
Chao Yu
Weijuan Tu
...
Deheng Ye
Wenbo Ding
Yaodong Yang
Yu Wang
SyDa
SSL
OnRL
185
9
0
02 Aug 2024
Hybrid Querying Over Relational Databases and Large Language Models
T. Pham
Cody T. Reynolds
A. El Abbadi
93
1
0
01 Aug 2024
Intermittent Semi-working Mask: A New Masking Paradigm for LLMs
Mingcong Lu
Jiangcai Zhu
Wang Hao
Zheng Li
Shusheng Zhang
Kailai Shao
Chao Chen
Nan Li
Feng Wang
Xin Lu
67
0
0
01 Aug 2024
GalleryGPT: Analyzing Paintings with Large Multimodal Models
Yi Bin
Wenhao Shi
Yujuan Ding
Zhiqiang Hu
Zheng Wang
Yang Yang
See-Kiong Ng
H. Shen
MLLM
96
11
0
01 Aug 2024
Towards Reliable Advertising Image Generation Using Human Feedback
Thorben Werner
Wei Feng
Haohan Wang
Yaoyu Li
Jingsen Wang
...
Maximilian Stubbemann
Junsheng Jin
Lars Schmidt-Thieme
Zhangang Lin
Jingping Shao
134
3
0
01 Aug 2024
What comes after transformers? -- A selective survey connecting ideas in deep learning
Johannes Schneider
AI4CE
125
2
0
01 Aug 2024
Memorization Capacity for Additive Fine-Tuning with Small ReLU Networks
Jy-yong Sohn
Dohyun Kwon
Seoyeon An
Kangwook Lee
111
0
0
01 Aug 2024
ABC Align: Large Language Model Alignment for Safety & Accuracy
Gareth Seneque
Lap-Hang Ho
Peter W. Glynn
Yinyu Ye
Jeffrey Molendijk
92
1
0
01 Aug 2024
A Policy-Gradient Approach to Solving Imperfect-Information Games with Best-Iterate Convergence
Mingyang Liu
Gabriele Farina
Asuman Ozdaglar
83
3
0
01 Aug 2024
Tamper-Resistant Safeguards for Open-Weight LLMs
Rishub Tamirisa
Bhrugu Bharathi
Long Phan
Andy Zhou
Alice Gatti
...
Andy Zou
Dawn Song
Bo Li
Dan Hendrycks
Mantas Mazeika
AAML
MU
135
63
0
01 Aug 2024
A new approach for encoding code and assisting code understanding
Mengdan Fan
Changde Du
Haiyan Zhao
Zhi Jin
157
0
0
01 Aug 2024
Finch: Prompt-guided Key-Value Cache Compression
Giulio Corallo
Paolo Papotti
124
3
0
31 Jul 2024
Automatic Generation of Behavioral Test Cases For Natural Language Processing Using Clustering and Prompting
Ying Li
Rahul Singh
Tarun Joshi
Agus Sudjianto
52
1
0
31 Jul 2024
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
Richard Ren
Steven Basart
Adam Khoja
Alice Gatti
Long Phan
...
Alexander Pan
Gabriel Mukobi
Ryan H. Kim
Stephen Fitz
Dan Hendrycks
ELM
87
26
0
31 Jul 2024
Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs
Shiping Liu
Kecheng Zheng
Wei Chen
MLLM
116
53
0
31 Jul 2024
Social Learning through Interactions with Other Agents: A Survey
Dylan Hillier
Cheston Tan
Jing Jiang
109
2
0
31 Jul 2024
QuestGen: Effectiveness of Question Generation Methods for Fact-Checking Applications
Rivik Setty
Vinay Setty
101
4
0
31 Jul 2024
FTuner: A Fast Dynamic Shape Tensors Program Auto-Tuner for Deep Learning Compilers
Pengyu Mu
Linquan Wei
Yi Liu
Rui Wang
62
1
0
31 Jul 2024
Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models
Zhengxuan Wu
Yuhao Zhang
Linquan Wei
Yumo Xu
Rujun Han
Yi Liu
Jifan Chen
Bonan Min
Zhiheng Huang
95
0
0
31 Jul 2024
Towards interfacing large language models with ASR systems using confidence measures and prompting
Maryam Naderi
Xingrui Yang
Weihan Wang
Sevada Hovsepyan
Weichen Dai
KELM
62
1
0
31 Jul 2024
Big Cooperative Learning
Yulai Cong
AI4CE
70
0
0
31 Jul 2024
Correcting Negative Bias in Large Language Models through Negative Attention Score Alignment
Sangwon Yu
Jongyoon Song
Bongkyu Hwang
Hoyoung Kang
Sooah Cho
Junhwa Choi
Seongho Joe
Taehee Lee
Youngjune Gwon
Sungroh Yoon
233
6
0
31 Jul 2024
TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods
Gabriel Loiseau
Damien Sileo
Damien Riquet
Maxime Meyer
Marc Tommasi
72
0
0
31 Jul 2024
Decomposed Prompting to Answer Questions on a Course Discussion Board
Brandon Jaipersaud
Paul Zhang
Jimmy Ba
Andrew Petersen
Lisa Zhang
Michael Ruogu Zhang
54
3
0
30 Jul 2024
ARCLE: The Abstraction and Reasoning Corpus Learning Environment for Reinforcement Learning
Hosung Lee
Sejin Kim
Seungpil Lee
Sanha Hwang
Jihwan Lee
Byung-Jun Lee
Sundong Kim
LRM
92
9
0
30 Jul 2024
Autonomous Improvement of Instruction Following Skills via Foundation Models
Zhiyuan Zhou
P. Atreya
Abraham Lee
Homer Walke
Oier Mees
Sergey Levine
95
14
0
30 Jul 2024
Fine-Tuned Large Language Model for Visualization System: A Study on Self-Regulated Learning in Education
Lin Gao
Jing Lu
Zekai Shao
Ziyue Lin
Shengbin Yue
Chio-in Ieong
Yi Sun
Rory James Zauner
Zhongyu Wei
Siming Chen
68
12
0
30 Jul 2024
Machine Unlearning in Generative AI: A Survey
Zheyuan Liu
Guangyao Dou
Zhaoxuan Tan
Yijun Tian
Meng Jiang
MU
109
19
0
30 Jul 2024
A federated large language model for long-term time series forecasting
Raed Abdel Sater
A. B. Hamza
AI4TS
44
5
0
30 Jul 2024
MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
Yupeng Chen
Senmiao Wang
Zhihang Lin
Yushun Zhang
Tian Ding
Ruoyu Sun
CLL
180
5
0
30 Jul 2024
CoMMIT: Coordinated Instruction Tuning for Multimodal Large Language Models
Junda Wu
Xintong Li
Tong Yu
Yu Wang
Xiang Chen
Jiuxiang Gu
Lina Yao
Jingbo Shang
Julian McAuley
75
2
0
29 Jul 2024
Apple Intelligence Foundation Language Models
Tom Gunter
Zirui Wang
Chong-Jun Wang
Ruoming Pang
Andy Narayanan
...
Xinwen Liu
Yang Zhao
Yin Xia
Zhile Ren
Zhongzheng Ren
148
40
0
29 Jul 2024
Can Editing LLMs Inject Harm?
Canyu Chen
Baixiang Huang
Zekun Li
Zhaorun Chen
Shiyang Lai
...
Xifeng Yan
William Wang
Philip Torr
Dawn Song
Kai Shu
KELM
157
15
0
29 Jul 2024
An Energy-based Model for Word-level AutoCompletion in Computer-aided Translation
Cheng Yang
Guoping Huang
Mo Yu
Zhirui Zhang
Siheng Li
Mingming Yang
Shuming Shi
Yujiu Yang
Lemao Liu
120
1
0
29 Jul 2024
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey
Jiangfei Duan
Shuo Zhang
Zerui Wang
Lijuan Jiang
Wenwen Qu
...
Dahua Lin
Yonggang Wen
Xin Jin
Tianwei Zhang
Peng Sun
159
13
0
29 Jul 2024
Legal Minds, Algorithmic Decisions: How LLMs Apply Constitutional Principles in Complex Scenarios
Camilla Bignotti
C. Camassa
AILaw
ELM
76
2
0
29 Jul 2024
Genetic Instruct: Scaling up Synthetic Generation of Coding Instructions for Large Language Models
Somshubra Majumdar
Vahid Noroozi
Mehrzad Samadi
Sean Narenthiran
Aleksander Ficek
Wasi Uddin Ahmad
Jocelyn Huang
Jagadeesh Balam
Boris Ginsburg
SyDa
140
3
0
29 Jul 2024
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Tianhao Wu
Weizhe Yuan
O. Yu. Golovneva
Jing Xu
Yuandong Tian
Jiantao Jiao
Jason Weston
Sainbayar Sukhbaatar
ALM
KELM
LRM
142
96
0
28 Jul 2024
SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain
Pierre Colombo
T. Pires
Malik Boudiaf
Rui Melo
Dominic Culver
Sofia Morgado
Etienne Malaboeuf
Gabriel Hautreux
Johanne Charpentier
Michael Desa
ELM
AILaw
ALM
98
17
0
28 Jul 2024
Logic Distillation: Learning from Code Function by Function for Planning and Decision-making
Dong Chen
Shilin Zhang
Fei Gao
Yueting Zhuang
Siliang Tang
Qidong Liu
Mingliang Xu
LRM
45
1
0
28 Jul 2024
Polynomial Regression as a Task for Understanding In-context Learning Through Finetuning and Alignment
Max Wilcoxson
Morten Svendgård
Ria Doshi
Dylan Davis
Reya Vir
Anant Sahai
53
0
0
27 Jul 2024
GP-VLS: A general-purpose vision language model for surgery
Samuel Schmidgall
Joseph Cho
C. Zakka
W. Hiesinger
LM&MA
141
6
0
27 Jul 2024