2203.02155
Cited By
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing "Training language models to follow instructions with human feedback"
50 / 6,372 papers shown
BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
Lin Gui
Cristina Garbacea
Victor Veitch
BDL
LM&MA
112
49
0
02 Jun 2024
Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective
Fabian Falck
Ziyu Wang
Chris Holmes
151
20
0
02 Jun 2024
Automatic Instruction Evolving for Large Language Models
Weihao Zeng
Can Xu
Yingxiu Zhao
Jianguang Lou
Weizhu Chen
SyDa
139
11
0
02 Jun 2024
Brainstorming Brings Power to Large Language Models of Knowledge Reasoning
Zining Qin
Chenhao Wang
Huiling Qin
Weijia Jia
LRM
73
1
0
02 Jun 2024
Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction
Xiaoyuan Li
Wenjie Wang
Moxin Li
Junrong Guo
Yang Zhang
Fuli Feng
ELM
LRM
91
20
0
02 Jun 2024
Freeplane: Unlocking Free Lunch in Triplane-Based Sparse-View Reconstruction Models
Wenqiang Sun
Zhengyi Wang
Shuo Chen
Yikai Wang
Zilong Chen
Jun Zhu
Jun Zhang
109
1
0
02 Jun 2024
Comprehensive Evaluation of Large Language Models for Topic Modeling
T. Doi
Masaru Isonuma
Hitomi Yanaka
ELM
63
1
0
02 Jun 2024
Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback
Chen Chen
Yuchen Hu
Wen Wu
Helin Wang
Chng Eng Siong
Chao Zhang
93
12
0
02 Jun 2024
LLMs Could Autonomously Learn Without External Supervision
Ke Ji
Junying Chen
Anningzhe Gao
Wenya Xie
Xiang Wan
Benyou Wang
86
4
0
02 Jun 2024
LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models
Liang Zhao
Tianwen Wei
Liang Zeng
Cheng Cheng
Liu Yang
...
Yimeng Gan
Rui Hu
Shuicheng Yan
Han Fang
Yahui Zhou
LLMAG
SyDa
121
11
0
02 Jun 2024
Inverse Constitutional AI: Compressing Preferences into Principles
Arduin Findeis
Timo Kaufmann
Eyke Hüllermeier
Samuel Albanie
Robert Mullins
SyDa
120
12
0
02 Jun 2024
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
Tianci Liu
Haoyu Wang
Shiyang Wang
Yu Cheng
Jing Gao
ALM
81
1
0
01 Jun 2024
Wav2Prompt: End-to-End Speech Prompt Generation and Tuning For LLM in Zero and Few-shot Learning
Keqi Deng
Guangzhi Sun
Phil Woodland
VLM
67
4
0
01 Jun 2024
A Survey on Large Language Models for Code Generation
Juyong Jiang
Fan Wang
Jiasi Shen
Sungju Kim
Sunghun Kim
152
204
0
01 Jun 2024
On Overcoming Miscalibrated Conversational Priors in LLM-based Chatbots
Christine Herlihy
Jennifer Neville
Tobias Schnabel
Adith Swaminathan
98
4
0
01 Jun 2024
Enhancing Presentation Slide Generation by LLMs with a Multi-Staged End-to-End Approach
Sambaran Bandyopadhyay
Himanshu Maheshwari
Anandhavelu Natarajan
Apoorv Saxena
75
8
0
01 Jun 2024
Beyond Metrics: Evaluating LLMs' Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios
Millicent Ochieng
Varun Gumma
Sunayana Sitaram
Jindong Wang
Vishrav Chaudhary
K. Ronen
Kalika Bali
Jacki O'Neill
61
4
0
01 Jun 2024
Phased Instruction Fine-Tuning for Large Language Models
Wei Pang
Chuan Zhou
Xiao-Hua Zhou
Xiaojie Wang
ALM
83
5
0
01 Jun 2024
Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training
Maximillian Chen
Ruoxi Sun
Sercan O. Arik
Tomas Pfister
LLMAG
109
11
0
31 May 2024
Query2CAD: Generating CAD models using natural language queries
Akshay Badagabettu
Sai Sravan Yarlagadda
A. Farimani
81
15
0
31 May 2024
Code Pretraining Improves Entity Tracking Abilities of Language Models
Najoung Kim
Sebastian Schuster
Shubham Toshniwal
75
14
0
31 May 2024
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
Tengyang Xie
Dylan J. Foster
Akshay Krishnamurthy
Corby Rosset
Ahmed Hassan Awadallah
Alexander Rakhlin
100
45
0
31 May 2024
Direct Alignment of Language Models via Quality-Aware Self-Refinement
Runsheng Yu
Yong Wang
Xiaoqi Jiao
Youzhi Zhang
James T. Kwok
90
6
0
31 May 2024
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
Xiaojun Jia
Tianyu Pang
Chao Du
Yihao Huang
Jindong Gu
Yang Liu
Xiaochun Cao
Min Lin
AAML
104
41
0
31 May 2024
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Haotian Hu
Yiqin Yang
Jianing Ye
Chengjie Wu
Ziqing Mai
Yujing Hu
Tangjie Lv
Changjie Fan
Qianchuan Zhao
Chongjie Zhang
OffRL
OnRL
78
3
0
31 May 2024
Improving Reward Models with Synthetic Critiques
Zihuiwen Ye
Fraser Greenlee-Scott
Max Bartolo
Phil Blunsom
Jon Ander Campos
Matthias Gallé
ALM
SyDa
LRM
103
24
0
31 May 2024
Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment
Yueqin Yin
Zhendong Wang
Yujia Xie
Weizhu Chen
Mingyuan Zhou
95
4
0
31 May 2024
Ovis: Structural Embedding Alignment for Multimodal Large Language Model
Shiyin Lu
Yang Li
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
Han-Jia Ye
VLM
MLLM
141
55
0
31 May 2024
Intersectional Unfairness Discovery
Gezheng Xu
Qi Chen
Charles Ling
Boyu Wang
Changjian Shui
73
3
0
31 May 2024
It is Simple Sometimes: A Study On Improving Aspect-Based Sentiment Analysis Performance
Laura Cabello
Uchenna Akujuobi
73
1
0
31 May 2024
Unveiling the Lexical Sensitivity of LLMs: Combinatorial Optimization for Prompt Enhancement
Pengwei Zhan
Zhen Xu
Qian Tan
Jie Song
Ru Xie
81
7
0
31 May 2024
GAMedX: Generative AI-based Medical Entity Data Extractor Using Large Language Models
Mohammed-Khalil Ghali
Abdelrahman Farrag
Hajar Sakai
Hicham El Baz
Yu Jin
Sarah Lam
LM&MA
MedIm
83
9
0
31 May 2024
OR-Bench: An Over-Refusal Benchmark for Large Language Models
Justin Cui
Wei-Lin Chiang
Ion Stoica
Cho-Jui Hsieh
ALM
163
55
0
31 May 2024
Standards for Belief Representations in LLMs
Daniel A. Herrmann
B. Levinstein
99
11
0
31 May 2024
SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents
Ethan Rathbun
Christopher Amato
Alina Oprea
OffRL
AAML
76
6
0
30 May 2024
Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning
Xinlu Zhang
Zhi Chen
Xi Ye
Xianjun Yang
Lichang Chen
William Y. Wang
Linda R. Petzold
LRM
132
15
0
30 May 2024
Xwin-LM: Strong and Scalable Alignment Practice for LLMs
Bolin Ni
Jingcheng Hu
Yixuan Wei
Houwen Peng
Zheng Zhang
Gaofeng Meng
Han Hu
LM&MA
ALM
71
3
0
30 May 2024
Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation
Guillaume Huguet
James Vuckovic
Kilian Fatras
Eric Thibodeau-Laufer
Pablo Lemos
...
Jarrid Rector-Brooks
Tara Akhound-Sadegh
Michael M. Bronstein
Alexander Tong
A. Bose
103
32
0
30 May 2024
Large Language Models Can Self-Improve At Web Agent Tasks
Ajay Patel
M. Hofmarcher
Claudiu Leoveanu-Condrei
Marius-Constantin Dinu
Chris Callison-Burch
Sepp Hochreiter
LLMAG
120
31
0
30 May 2024
Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models
Himangi Mittal
Nakul Agarwal
Shao-Yuan Lo
Kwonjoon Lee
119
18
0
30 May 2024
Group Robust Preference Optimization in Reward-free RLHF
Shyam Sundhar Ramesh
Yifan Hu
Iason Chaimalas
Viraj Mehta
Pier Giuseppe Sessa
Haitham Bou-Ammar
Ilija Bogunovic
96
39
0
30 May 2024
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models
Chen Zhang
Chengguang Tang
Dading Chong
Ke Shi
Guohua Tang
Feng Jiang
Haizhou Li
73
4
0
30 May 2024
TAIA: Large Language Models are Out-of-Distribution Data Learners
Shuyang Jiang
Yusheng Liao
Ya Zhang
Yu Wang
Yanfeng Wang
77
5
0
30 May 2024
InstructionCP: A fast approach to transfer Large Language Models into target language
Kuang-Ming Chen
Hung-yi Lee
CLL
81
3
0
30 May 2024
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
Jingchang Chen
Hongxuan Tang
Zheng Chu
Qianglong Chen
Zekun Wang
Ming Liu
Bing Qin
127
6
0
30 May 2024
NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models
Kai Wu
Boyuan Jiang
Zhengkai Jiang
Qingdong He
Donghao Luo
Shengzhi Wang
Qingwen Liu
Chengjie Wang
VLM
MLLM
115
4
0
30 May 2024
Would I Lie To You? Inference Time Alignment of Language Models using Direct Preference Heads
Avelina Asada Hadji-Kyriacou
Ognjen Arandjelović
37
1
0
30 May 2024
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model
Chaochen Gao
Xing Wu
Qingfang Fu
Songlin Hu
SyDa
110
7
0
30 May 2024
Preference Alignment with Flow Matching
Minu Kim
Yongsik Lee
Sehyeok Kang
Jihwan Oh
Song Chong
Seyoung Yun
91
2
0
30 May 2024
Instruction-Guided Visual Masking
Jinliang Zheng
Jianxiong Li
Si Cheng
Yinan Zheng
Jiaming Li
Jihao Liu
Yu Liu
Jingjing Liu
Xianyuan Zhan
136
7
0
30 May 2024