arXiv:2203.02155
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing "Training language models to follow instructions with human feedback" (50 of 6,392 shown)
Learning from Failures in Multi-Attempt Reinforcement Learning
Stephen Chung
Wenyu Du
Jie Fu
LRM
92
3
0
04 Mar 2025
LLM-Safety Evaluations Lack Robustness
Tim Beyer
Sophie Xhonneux
Simon Geisler
Gauthier Gidel
Leo Schwinn
Stephan Günnemann
ALM
ELM
488
2
0
04 Mar 2025
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
Yuzhe Gu
Wentao Zhang
Chengqi Lyu
Dahua Lin
Kai Chen
128
2
0
04 Mar 2025
AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation
Songming Zhang
Xue Zhang
Tong Zhang
Bojie Hu
Yufeng Chen
Jinan Xu
125
1
0
04 Mar 2025
Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models
Joykirat Singh
Tanmoy Chakraborty
A. Nambi
AI4Cl
LRM
ReLM
94
1
0
04 Mar 2025
ATLaS: Agent Tuning via Learning Critical Steps
Zhixun Chen
Ming Li
Yuanmin Huang
Yali Du
Meng Fang
Dinesh Manocha
201
5
0
04 Mar 2025
Enhancing Non-English Capabilities of English-Centric Large Language Models through Deep Supervision Fine-Tuning
Wenshuai Huo
Xiaocheng Feng
Yichong Huang
Chengpeng Fu
Baohang Li
...
Dandan Tu
Duyu Tang
Yunfei Lu
Hui Wang
Bing Qin
102
4
0
03 Mar 2025
Cancer Type, Stage and Prognosis Assessment from Pathology Reports using LLMs
Rachit Saluja
Jacob Rosenthal
Yoav Artzi
David J. Pisapia
B. Liechty
M. Sabuncu
LM&MA
ELM
140
1
0
03 Mar 2025
Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh
Fajri Koto
Rituraj Joshi
Nurdaulet Mukhituly
Yanjie Wang
Zhuohan Xie
...
Avraham Sheinin
Natalia Vassilieva
Neha Sengupta
Larry Murray
Preslav Nakov
ALM
KELM
136
0
0
03 Mar 2025
What's Behind PPO's Collapse in Long-CoT? Value Optimization Holds the Secret
Yufeng Yuan
Yu Yue
Ruofei Zhu
Tiantian Fan
Lin Yan
OffRL
114
21
0
03 Mar 2025
Answer, Refuse, or Guess? Investigating Risk-Aware Decision Making in Language Models
Cheng-Kuang Wu
Zhi Rui Tam
Chieh-Yen Lin
Yun-Nung Chen
Hung-yi Lee
82
0
0
03 Mar 2025
Do GFlowNets Transfer? Case Study on the Game of 24/42
Adesh Gupta
Abhinav Kumar
Mansi Gupta
Paras Chopra
158
0
0
03 Mar 2025
CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom
Yisen Li
Lingfeng Yang
Wenxuan Shen
Pan Zhou
Yao Wan
Weiwei Lin
Benlin Liu
113
1
0
03 Mar 2025
Parameter-Efficient Fine-Tuning of Large Language Models via Deconvolution in Subspace
Jia-Chen Zhang
Yu-Jie Xiong
Chun-Ming Xia
Dong-Hai Zhu
Xi-He Qiu
113
4
0
03 Mar 2025
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator
Kaiwen Zheng
Yongxin Chen
Huayu Chen
Guande He
Xuan Li
Jun Zhu
Qinsheng Zhang
DiffM
157
3
0
03 Mar 2025
SRAG: Structured Retrieval-Augmented Generation for Multi-Entity Question Answering over Wikipedia Graph
Teng Lin
Yizhang Zhu
Yuyu Luo
Nan Tang
RALM
3DV
92
1
0
03 Mar 2025
All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning
Gokul Swamy
Sanjiban Choudhury
Wen Sun
Zhiwei Steven Wu
J. Andrew Bagnell
OffRL
142
20
0
03 Mar 2025
M3HF: Multi-agent Reinforcement Learning from Multi-phase Human Feedback of Mixed Quality
Ziyan Wang
Zhicheng Zhang
Fei Fang
Yali Du
123
3
0
03 Mar 2025
Active Learning for Direct Preference Optimization
Branislav Kveton
Xintong Li
Julian McAuley
Ryan Rossi
Jingbo Shang
Junda Wu
Tong Yu
118
1
0
03 Mar 2025
In-context Learning vs. Instruction Tuning: The Case of Small and Multilingual Language Models
David Ponce
Thierry Etchegoyhen
172
1
0
03 Mar 2025
DPR: Diffusion Preference-based Reward for Offline Reinforcement Learning
Teng Pang
Bingzheng Wang
Guoqiang Wu
Yilong Yin
OffRL
141
0
0
03 Mar 2025
Visual-RFT: Visual Reinforcement Fine-Tuning
Ziyu Liu
Zeyi Sun
Yuhang Zang
Xiaoyi Dong
Yuhang Cao
Haodong Duan
Dahua Lin
Jiaqi Wang
ObjD
VLM
LRM
165
129
0
03 Mar 2025
DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models
Zhendong Wang
Jianmin Bao
Shuyang Gu
Dong Chen
Wengang Zhou
Haoyang Li
DiffM
93
3
0
03 Mar 2025
ClipGrader: Leveraging Vision-Language Models for Robust Label Quality Assessment in Object Detection
Hong Lu
Yali Bian
Rahul C. Shah
ObjD
VLM
131
0
0
03 Mar 2025
Building Safe GenAI Applications: An End-to-End Overview of Red Teaming for Large Language Models
Alberto Purpura
Sahil Wadhwa
Jesse Zymet
Akshay Gupta
Andy Luo
Melissa Kazemi Rad
Swapnil Shinde
Mohammad Sorower
AAML
472
0
0
03 Mar 2025
PEO: Improving Bi-Factorial Preference Alignment with Post-Training Policy Extrapolation
Yuxuan Liu
105
0
0
03 Mar 2025
None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering
Zhi Rui Tam
Cheng-Kuang Wu
Chieh-Yen Lin
Yun-Nung Chen
106
2
0
03 Mar 2025
NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks In Open Domains
Wonje Choi
Jinwoo Park
Sanghyun Ahn
Daehee Lee
Honguk Woo
447
1
0
02 Mar 2025
Cyber for AI at SemEval-2025 Task 4: Forgotten but Not Lost: The Balancing Act of Selective Unlearning in Large Language Models
Dinesh Srivasthav P
Bala Mallikarjunarao Garlapati
MU
74
0
0
02 Mar 2025
Instruct-of-Reflection: Enhancing Large Language Models Iterative Reflection Capabilities via Dynamic-Meta Instruction
Liping Liu
Chunhong Zhang
Likang Wu
Chuang Zhao
Zheng Hu
Ming He
Jianping Fan
LLMAG
LRM
75
2
0
02 Mar 2025
Output Length Effect on DeepSeek-R1's Safety in Forced Thinking
Xuying Li
Zhuo Li
Yuji Kosuga
Victor Bian
AAML
LRM
108
4
0
02 Mar 2025
PABBO: Preferential Amortized Black-Box Optimization
Xinyu Zhang
Daolang Huang
Samuel Kaski
Julien Martinelli
89
1
0
02 Mar 2025
Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models
Dilxat Muhtar
Enzhuo Zhang
Zhenshi Li
Feng-Xue Gu
Yanglangxing He
Pengfeng Xiao
Xueliang Zhang
102
3
0
02 Mar 2025
More of the Same: Persistent Representational Harms Under Increased Representation
Jennifer Mickel
Maria De-Arteaga
Leqi Liu
Kevin Tian
78
1
0
01 Mar 2025
Distributionally Robust Reinforcement Learning with Human Feedback
Debmalya Mandal
Paulius Sasnauskas
Goran Radanović
108
3
0
01 Mar 2025
CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering
Tianyu Huai
Jie Zhou
Xingjiao Wu
Qin Chen
Qingchun Bai
Ze Zhou
Liang He
MoE
124
4
0
01 Mar 2025
AesthetiQ: Enhancing Graphic Layout Design via Aesthetic-Aware Preference Alignment of Multi-modal Large Language Models
Sohan Patnaik
Rishabh Jain
Balaji Krishnamurthy
Mausoom Sarkar
89
0
0
01 Mar 2025
Sentence-level Reward Model can Generalize Better for Aligning LLM from Human Preference
Wenjie Qiu
Yi-Chen Li
Xuqin Zhang
Tianyi Zhang
Yiming Zhang
Zongzhang Zhang
Yang Yu
ALM
113
1
0
01 Mar 2025
Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning
Tianci Liu
R. Li
Yunzhe Qi
Hui Liu
Xianfeng Tang
...
Qingyu Yin
Monica Cheng
Jun Huan
Haoyu Wang
Jing Gao
KELM
100
4
0
01 Mar 2025
PodAgent: A Comprehensive Framework for Podcast Generation
Yujia Xiao
Lei He
Haohan Guo
Fenglong Xie
Tan Lee
433
1
0
01 Mar 2025
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
Wei Suo
Lijun Zhang
Mengyang Sun
Lin Yuanbo Wu
Peng Wang
Yize Zhang
MLLM
VLM
115
3
0
01 Mar 2025
Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable
Tiansheng Huang
Sihao Hu
Fatih Ilhan
Selim Furkan Tekin
Zachary Yahn
Yichang Xu
Ling Liu
137
22
0
01 Mar 2025
Robust Multi-Objective Preference Alignment with Online DPO
Raghav Gupta
Ryan Sullivan
Yunxuan Li
Samrat Phatale
Abhinav Rastogi
69
1
0
01 Mar 2025
Efficient Jailbreaking of Large Models by Freeze Training: Lower Layers Exhibit Greater Sensitivity to Harmful Content
Hongyuan Shen
Min Zheng
Jincheng Wang
Yang Zhao
84
0
0
28 Feb 2025
Reinforcement Learning with Curriculum-inspired Adaptive Direct Policy Guidance for Truck Dispatching
Shi Meng
Bin Tian
Xiaotong Zhang
OffRL
55
0
0
28 Feb 2025
Llamarine: Open-source Maritime Industry-specific Large Language Model
William Nguyen
An Phan
Konobu Kimura
Hitoshi Maeno
Mika Tanaka
Quynh Le
William Poucher
Christopher Nguyen
LRM
77
0
0
28 Feb 2025
WorldModelBench: Judging Video Generation Models As World Models
Dacheng Li
Yunhao Fang
Yukang Chen
Shuo Yang
Shiyi Cao
...
Hongxu Yin
Joseph E. Gonzalez
Ion Stoica
Enze Xie
Yaojie Lu
VGen
110
7
0
28 Feb 2025
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
Zhenyi Shen
Hanqi Yan
Linhai Zhang
Zhanghao Hu
Yali Du
Yulan He
LRM
179
27
0
28 Feb 2025
Dynamically Local-Enhancement Planner for Large-Scale Autonomous Driving
Nanshan Deng
Weitao Zhou
Bo Zhang
Junze Wen
Kun Jiang
Zhong Cao
Ke Wang
57
0
0
28 Feb 2025
Re-evaluating Theory of Mind evaluation in large language models
Jennifer Hu
Felix Sosa
T. Ullman
156
2
0
28 Feb 2025