arXiv:2203.02155
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing "Training language models to follow instructions with human feedback"
Showing 50 of 6,388 citing papers
The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks
Minghao Wu
Weixuan Wang
Sinuo Liu
Huifeng Yin
Xintong Wang
Yu Zhao
Chenyang Lyu
Longyue Wang
Weihua Luo
Kaifu Zhang
ELM
156
5
0
22 Apr 2025
Generative AI for Research Data Processing: Lessons Learnt From Three Use Cases
Modhurita Mitra
Martine G. de Vos
Nicola Cortinovis
Dawa Ometto
75
0
0
22 Apr 2025
Dynamic Early Exit in Reasoning Models
Chenxu Yang
Qingyi Si
Yongjie Duan
Zheliang Zhu
Chenyu Zhu
Zheng Lin
Li Cao
Weiping Wang
ReLM
LRM
189
22
0
22 Apr 2025
Instruction-Tuning Data Synthesis from Scratch via Web Reconstruction
Yuxin Jiang
Yijiao Wang
Chuhan Wu
Xinyi Dai
Yan Xu
...
Yucheng Wang
Xin Jiang
Lifeng Shang
Ruiming Tang
Wenjie Wang
142
0
0
22 Apr 2025
MARFT: Multi-Agent Reinforcement Fine-Tuning
Junwei Liao
Muning Wen
Jun Wang
Weinan Zhang
OffRL
167
5
0
21 Apr 2025
Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions
Saffron Huang
Esin Durmus
Miles McCain
Kunal Handa
Alex Tamkin
Jerry Hong
Michael Stern
Arushi Somani
Xiuruo Zhang
Deep Ganguli
VLM
119
6
0
21 Apr 2025
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators
Yilun Zhou
Austin Xu
Peifeng Wang
Caiming Xiong
Shafiq Joty
ELM
ALM
LRM
174
5
0
21 Apr 2025
DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution
Miaomiao Cai
Simiao Li
Wei Li
X. Y. Huang
Hanting Chen
Jie Hu
Yunhe Wang
77
1
0
21 Apr 2025
In-context Ranking Preference Optimization
Junda Wu
Rohan Surana
Zhouhang Xie
Yiran Shen
Yu Xia
Tong Yu
Ryan Rossi
Prithviraj Ammanabrolu
Julian McAuley
97
0
0
21 Apr 2025
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning
Jie Cheng
Ruixi Qiao
Lijun Li
Chao Guo
Jianmin Wang
Gang Xiong
Yisheng Lv
Fei-Yue Wang
LRM
467
5
0
21 Apr 2025
Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs
Marina Sakharova
Abhinav Anand
Mira Mezini
139
0
0
21 Apr 2025
Establishing Reliability Metrics for Reward Models in Large Language Models
Yizhou Chen
Yawen Liu
Xuesi Wang
Qingtao Yu
Guangda Huzhang
Anxiang Zeng
Han Yu
Zhiming Zhou
88
0
0
21 Apr 2025
LoRe: Personalizing LLMs via Low-Rank Reward Modeling
Avinandan Bose
Zhihan Xiong
Yuejie Chi
Simon S. Du
Lin Xiao
Maryam Fazel
86
2
0
20 Apr 2025
SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization
Liang Peng
Boxi Wu
Haoran Cheng
Yibo Zhao
Xiaofei He
63
0
0
20 Apr 2025
Reinforcement Learning from Multi-level and Episodic Human Feedback
Muhammad Qasim Elahi
Somtochukwu Oguchienti
Maheed H. Ahmed
Mahsa Ghasemi
OffRL
96
0
0
20 Apr 2025
An LLM-enabled Multi-Agent Autonomous Mechatronics Design Framework
Zeyu Wang
Frank P.-W. Lo
Qian Chen
Yongqi Zhang
Chen Lin
Xu Chen
Zhenhua Yu
Alexander J. Thompson
Eric M. Yeatman
Benny Lo
AI4CE
75
1
0
20 Apr 2025
PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines
Reya Vir
Shreya Shankar
Harrison Chase
Will Fu-Hinthorn
Aditya G. Parameswaran
AI4TS
87
0
0
20 Apr 2025
A Framework for Benchmarking and Aligning Task-Planning Safety in LLM-Based Embodied Agents
Yuting Huang
Leilei Ding
ZhiPeng Tang
Tianfu Wang
Xinrui Lin
Weinan Zhang
Mingxiao Ma
Yanyong Zhang
LLMAG
104
1
0
20 Apr 2025
Meta-Thinking in LLMs via Multi-Agent Reinforcement Learning: A Survey
Ahsan Bilal
Muhammad Ahmed Mohsin
Muhammad Umer
Muhammad Awais Khan Bangash
Muhammad Ali Jamshed
LLMAG
LRM
AI4CE
164
1
0
20 Apr 2025
Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation
Tuhina Tripathi
Manya Wadhwa
Greg Durrett
S. Niekum
78
0
0
20 Apr 2025
Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification
Takuma Udagawa
Yang Zhao
H. Kanayama
Bishwaranjan Bhattacharjee
65
0
0
19 Apr 2025
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Yang Yue
Zhiqi Chen
Rui Lu
Andrew Zhao
Zhaokai Wang
Yang Yue
Shiji Song
Gao Huang
ReLM
LRM
235
128
0
18 Apr 2025
Science Hierarchography: Hierarchical Organization of Science Literature
Muhan Gao
Jash Shah
Weiqi Wang
Daniel Khashabi
152
1
0
18 Apr 2025
LLM Sensitivity Evaluation Framework for Clinical Diagnosis
Chenwei Yan
Xiangling Fu
Yuxuan Xiong
Tianyi Wang
Siu Cheung Hui
Ji Wu
Xien Liu
LM&MA
ELM
82
2
0
18 Apr 2025
Analysing the Robustness of Vision-Language-Models to Common Corruptions
Muhammad Usama
Syeda Aishah Asim
Syed Bilal Ali
Syed Talal Wasim
Umair Bin Mansoor
VLM
93
0
0
18 Apr 2025
Do Prompt Patterns Affect Code Quality? A First Empirical Assessment of ChatGPT-Generated Code
Antonio Della Porta
Stefano Lambiase
Fabio Palomba
57
0
0
18 Apr 2025
ReMedy: Learning Machine Translation Evaluation from Human Preferences with Reward Modeling
Shaomu Tan
Christof Monz
110
0
0
18 Apr 2025
Prejudge-Before-Think: Enhancing Large Language Models at Test-Time by Process Prejudge Reasoning
Jinqiao Wang
Jin Jiang
Yang Liu
Hao Fei
Xunliang Cai
LRM
87
0
0
18 Apr 2025
Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning
Yixuan Even Xu
Yash Savani
Fei Fang
Zico Kolter
OffRL
115
12
0
18 Apr 2025
Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment
Xiaotian Zhang
Ruizhe Chen
Yang Feng
Zuozhu Liu
111
2
0
17 Apr 2025
SMPL-GPTexture: Dual-View 3D Human Texture Estimation using Text-to-Image Generation Models
Mingxiao Tu
Shuchang Ye
Hoijoon Jung
Jinman Kim
DiffM
56
0
0
17 Apr 2025
Governance Challenges in Reinforcement Learning from Human Feedback: Evaluator Rationality and Reinforcement Stability
Dana Alsagheer
Abdulrahman Kamal
Mohammad Kamal
W. Shi
ALM
47
0
0
17 Apr 2025
VLMGuard-R1: Proactive Safety Alignment for VLMs via Reasoning-Driven Prompt Optimization
Menglan Chen
Xianghe Pang
Jingjing Dong
Wenhao Wang
Yaxin Du
Siheng Chen
LRM
121
0
0
17 Apr 2025
Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
João Loula
Benjamin LeBrun
Li Du
Ben Lipkin
Clemente Pasti
...
Ryan Cotterell
Vikash K. Mansinghka
Alexander K. Lew
Tim Vieira
Timothy J. O'Donnell
167
8
0
17 Apr 2025
MAIN: Mutual Alignment Is Necessary for Instruction Tuning
Fanyi Yang
Jianfeng Liu
Xinsong Zhang
Haoyu Liu
Xixin Cao
Yuefeng Zhan
H. Sun
Weiwei Deng
Feng Sun
Qi Zhang
ALM
62
0
0
17 Apr 2025
Data-efficient LLM Fine-tuning for Code Generation
Weijie Lv
X. Xia
Sheng-Jun Huang
ALM
SyDa
61
0
0
17 Apr 2025
Image-Editing Specialists: An RLAIF Approach for Diffusion Models
Elior Benarous
Yilun Du
Heng Yang
62
0
0
17 Apr 2025
Aligning Constraint Generation with Design Intent in Parametric CAD
Evan Casey
Tianyu Zhang
Shu Ishida
John Roger Thompson
Amir Hosein Khasahmadi
Joseph George Lambourne
P. Jayaraman
K. Willis
94
0
0
17 Apr 2025
GraphAttack: Exploiting Representational Blindspots in LLM Safety Mechanisms
Sinan He
An Wang
63
0
0
17 Apr 2025
EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery
Wei Zhang
Miaoxin Cai
Yaqian Ning
Tianze Zhang
Yin Zhuang
He Chen
Jun Li
Xuerui Mao
105
0
0
17 Apr 2025
Science-T2I: Addressing Scientific Illusions in Image Synthesis
Jialuo Li
Wenhao Chai
Xingyu Fu
Haiyang Xu
Saining Xie
MedIm
82
1
0
17 Apr 2025
Design Topological Materials by Reinforcement Fine-Tuned Generative Model
Haosheng Xu
Dongheng Qian
Zhixuan Liu
Yadong Jiang
Jing Wang
58
1
0
17 Apr 2025
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
Xiangyan Liu
Jinjie Ni
Zijian Wu
Chao Du
Longxu Dou
Haoran Wang
Tianyu Pang
Michael Shieh
OffRL
LRM
490
16
0
17 Apr 2025
Energy-Based Reward Models for Robust Language Model Alignment
Anamika Lochab
Ruqi Zhang
431
0
0
17 Apr 2025
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
Siyan Zhao
Devaansh Gupta
Qinqing Zheng
Aditya Grover
DiffM
LRM
AI4CE
169
9
0
16 Apr 2025
Evaluating the Diversity and Quality of LLM Generated Content
Alexander Shypula
Shuo Li
Botong Zhang
Vishakh Padmakumar
Kayo Yin
Osbert Bastani
103
5
0
16 Apr 2025
Multilingual Contextualization of Large Language Models for Document-Level Machine Translation
Miguel Moura Ramos
Patrick Fernandes
Sweta Agrawal
André F.T. Martins
97
0
0
16 Apr 2025
An LLM-as-a-judge Approach for Scalable Gender-Neutral Translation Evaluation
Andrea Piergentili
Beatrice Savoldi
Matteo Negri
L. Bentivogli
ELM
77
1
0
16 Apr 2025
Can Pre-training Indicators Reliably Predict Fine-tuning Outcomes of LLMs?
Hansi Zeng
Kai Hui
Honglei Zhuang
Zhen Qin
Zhenrui Yue
Hamed Zamani
Dana Alon
63
0
0
16 Apr 2025
Reward Consistency: Improving Multi-Objective Alignment from a Data-Centric Perspective
Zhihao Xu
Yongqi Tong
Xin Zhang
Jun Zhou
Xiting Wang
78
0
0
15 Apr 2025