2203.02155
Cited By
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing
"Training language models to follow instructions with human feedback"
50 / 6,370 papers shown
Title
KARE-RAG: Knowledge-Aware Refinement and Enhancement for RAG
Yongjian Li
HaoCheng Chu
Yukun Yan
Zhenghao Liu
S. Yu
Zheni Zeng
Ruobing Wang
Sen Song
Zhiyuan Liu
Maosong Sun
49
0
0
03 Jun 2025
A Trustworthiness-based Metaphysics of Artificial Intelligence Systems
Andrea Ferrario
40
0
0
03 Jun 2025
BNPO: Beta Normalization Policy Optimization
Changyi Xiao
Mengdi Zhang
Yixin Cao
OffRL
66
0
0
03 Jun 2025
EssayBench: Evaluating Large Language Models in Multi-Genre Chinese Essay Writing
Fan Gao
Dongyuan Li
Ding Xia
Fei Mi
Yasheng Wang
Lifeng Shang
Baojun Wang
ELM
42
0
0
03 Jun 2025
Minos: A Multimodal Evaluation Model for Bidirectional Generation Between Image and Text
Junzhe Zhang
Huixuan Zhang
Xinyu Hu
Li Lin
Mingqi Gao
Shi Qiu
Xiaojun Wan
MLLM
67
0
0
03 Jun 2025
One Missing Piece for Open-Source Reasoning Models: A Dataset to Mitigate Cold-Starting Short CoT LLMs in RL
Hyungjoo Chae
Dongjin Kang
J. Kim
Beong-woo Kwak
Sunghyun Park
Haeju Park
Jinyoung Yeo
M. Lee
Kyungjae Lee
ReLM
LRM
57
0
0
03 Jun 2025
FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models
Yan Gao
Massimo Roberto Scamarcia
Javier Fernandez-Marques
Mohammad Naseri
Chong Shen Ng
...
Junyan Wang
Zheyuan Liu
Daniel J. Beutel
Lingjuan Lyu
Nicholas D. Lane
ALM
69
1
0
03 Jun 2025
From Anger to Joy: How Nationality Personas Shape Emotion Attribution in Large Language Models
M. Kamruzzaman
Abdullah Al Monsur
Gene Louis Kim
Anshuman Chhabra
66
0
0
03 Jun 2025
Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
Yunhong Lu
Qichao Wang
H. Cao
Xiaoyin Xu
Min Zhang
53
0
0
03 Jun 2025
Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem
Yubo Wang
Ping Nie
Kai Zou
Lijun Wu
Wenhu Chen
OffRL
ReLM
LRM
26
0
0
03 Jun 2025
BitBypass: A New Direction in Jailbreaking Aligned Large Language Models with Bitstream Camouflage
Kalyan Nakka
Nitesh Saxena
54
0
0
03 Jun 2025
EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving
Shihan Dou
Ming Zhang
Chenhao Huang
Jiayi Chen
F. Chen
...
Wei Chengzhi
Lin Yan
Qi Zhang
Xuanjing Huang
ELM
88
0
0
03 Jun 2025
Should LLM Safety Be More Than Refusing Harmful Instructions?
Utsav Maskey
Mark Dras
Usman Naseem
70
0
0
03 Jun 2025
IndoSafety: Culturally Grounded Safety for LLMs in Indonesian Languages
Muhammad Falensi Azmi
Muhammad Dehan Al Kautsar
Alfan Farizki Wicaksono
Fajri Koto
54
0
0
03 Jun 2025
Beyond Text Compression: Evaluating Tokenizers Across Scales
Jonas F. Lotz
António V. Lopes
Stephan Peitz
Hendra Setiawan
Leonardo Emili
61
0
0
03 Jun 2025
EgoVLM: Policy Optimization for Egocentric Video Understanding
Ashwin Vinod
Shrey Pandit
Aditya Vavre
Linshen Liu
LRM
48
0
0
03 Jun 2025
Corrigibility as a Singular Target: A Vision for Inherently Reliable Foundation Models
Ram Potham
Max Harms
LRM
67
0
0
03 Jun 2025
Expanding before Inferring: Enhancing Factuality in Large Language Models through Premature Layers Interpolation
Dingwei Chen
Ziqiang Liu
Feiteng Fang
Chak Tou Leong
Shiwen Ni
A. Argha
Hamid Alinejad-Rokny
Min Yang
Chengming Li
KELM
HILM
61
0
0
03 Jun 2025
DPO Learning with LLMs-Judge Signal for Computer Use Agents
Man Luo
David Cobbley
Xin Su
Shachar Rosenman
Vasudev Lal
Shao-Yen Tseng
Phillip Howard
51
0
0
03 Jun 2025
Understanding the Impact of Sampling Quality in Direct Preference Optimization
Kyung Rok Kim
Yumo Bai
Chonghuan Wang
Guanting Chen
22
0
0
03 Jun 2025
XToM: Exploring the Multilingual Theory of Mind for Large Language Models
Chunkit Chan
Yauwai Yim
Hongchuan Zeng
Zhiying Zou
Xinyuan Cheng
...
Ginny Wong
Helmut Schmid
Hinrich Schütze
Simon See
Yangqiu Song
LRM
69
0
0
03 Jun 2025
Provable Reinforcement Learning from Human Feedback with an Unknown Link Function
Qining Zhang
Lei Ying
75
0
0
03 Jun 2025
MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching
Liang Yue
Yihong Tang
Kehai Chen
Jie Liu
Min Zhang
LLMAG
65
0
0
03 Jun 2025
Rethinking Dynamic Networks and Heterogeneous Computing with Automatic Parallelization
Ruilong Wu
Xinjiao Li
Yisu Wang
Xinyu Chen
Dirk Kutscher
59
0
0
03 Jun 2025
Towards Human-like Preference Profiling in Sequential Recommendation
Z. Ouyang
Qianlong Wen
Chunhui Zhang
Yanfang Ye
Soroush Vosoughi
HAI
27
0
0
02 Jun 2025
Respond Beyond Language: A Benchmark for Video Generation in Response to Realistic User Intents
Shuting Wang
Yunqi Liu
Zixin Yang
Ning Hu
Zhicheng Dou
Chenyan Xiong
VGen
54
0
0
02 Jun 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Hyojin Bahng
Caroline Chan
F. Durand
Phillip Isola
EGVM
36
0
0
02 Jun 2025
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
S. Wang
Le Yu
Chang Gao
Chujie Zheng
Shixuan Liu
...
Yang Yue
S. Song
Bowen Yu
Gao Huang
Junyang Lin
LRM
72
9
0
02 Jun 2025
Stochastically Dominant Peer Prediction
Yichi Zhang
Shengwei Xu
David Pennock
Grant Schoenebeck
21
0
0
02 Jun 2025
Detoxification of Large Language Models through Output-layer Fusion with a Calibration Model
Yuanhe Tian
Mingjie Deng
Guoqing Jin
Yan Song
MU
KELM
59
0
0
02 Jun 2025
Fodor and Pylyshyn's Legacy - Still No Human-like Systematic Compositionality in Neural Networks
Tim Woydt
Moritz Willig
Antonia Wüst
Lukas Helff
Wolfgang Stammer
Constantin Rothkopf
Kristian Kersting
66
1
0
02 Jun 2025
Overcoming Multi-step Complexity in Multimodal Theory-of-Mind Reasoning: A Scalable Bayesian Planner
Chunhui Zhang
Z. Ouyang
Kwonjoon Lee
Nakul Agarwal
Sean Dae Houlihan
Soroush Vosoughi
Shao-Yuan Lo
LRM
70
0
0
02 Jun 2025
Automatic Stage Lighting Control: Is it a Rule-Driven Process or Generative Task?
Zijian Zhao
Dian Jin
Zijing Zhou
Xiaoyu Zhang
41
0
0
02 Jun 2025
CoRE: Condition-based Reasoning for Identifying Outcome Variance in Complex Events
Sai Vallurupalli
Francis Ferraro
59
0
0
02 Jun 2025
Synthetic Data Augmentation using Pre-trained Diffusion Models for Long-tailed Food Image Classification
GaYeon Koh
Hyun-Jic Oh
Jeonghyun Noh
Won-Ki Jeong
DiffM
52
0
0
02 Jun 2025
A Descriptive and Normative Theory of Human Beliefs in RLHF
Sylee Dandekar
Shripad Deshmukh
Frank Chiu
W. B. Knox
S. Niekum
60
0
0
02 Jun 2025
TSRating: Rating Quality of Diverse Time Series Data by Meta-learning from LLM Judgment
Shunyu Wu
Dan Li
Haozheng Ye
Zhuomin Chen
Jiahui Zhou
Jian Lou
Zibin Zheng
See-Kiong Ng
AI4TS
49
0
0
02 Jun 2025
Incentivizing LLMs to Self-Verify Their Answers
Fuxiang Zhang
Jiacheng Xu
Chaojie Wang
Ce Cui
Yang Liu
Bo An
ReLM
LRM
61
0
0
02 Jun 2025
AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning
Zhong Zhang
Yaxi Lu
Yikun Fu
Yupeng Huo
Shenzhi Yang
...
Chongyi Wang
Chi Chen
Yuan Yao
Zhiyuan Liu
Maosong Sun
LLMAG
ALM
69
0
0
02 Jun 2025
IF-GUIDE: Influence Function-Guided Detoxification of LLMs
Zachary Coalson
Juhan Bae
Nicholas Carlini
Sanghyun Hong
TDI
86
0
0
02 Jun 2025
MINT: Multimodal Instruction Tuning with Multimodal Interaction Grouping
Xiaojun Shan
Qi Cao
Xing Han
Haofei Yu
Paul Liang
55
0
0
02 Jun 2025
Improving LLM Agents with Reinforcement Learning on Cryptographic CTF Challenges
Lajos Muzsai
David Imolai
András Lukács
LLMAG
LRM
23
0
0
01 Jun 2025
XGUARD: A Graded Benchmark for Evaluating Safety Failures of Large Language Models on Extremist Content
Vadivel Abishethvarman
Bhavik Chandna
Pratik Jalan
Usman Naseem
ELM
24
0
0
01 Jun 2025
Generalizable LLM Learning of Graph Synthetic Data with Reinforcement Learning
Yizhuo Zhang
Heng Wang
Shangbin Feng
Zhaoxuan Tan
Xinyun Liu
Yulia Tsvetkov
OffRL
72
0
0
01 Jun 2025
Toward Structured Knowledge Reasoning: Contrastive Retrieval-Augmented Generation on Experience
Jiawei Gu
Ziting Xian
Yuanzhen Xie
Ye Liu
Enjie Liu
Ruichao Zhong
Mochi Gao
Yunzhi Tan
Bo Hu
Zang Li
RALM
53
0
0
01 Jun 2025
Improving Automatic Evaluation of Large Language Models (LLMs) in Biomedical Relation Extraction via LLMs-as-the-Judge
Md Tahmid Rahman Laskar
Israt Jahan
Elham Dolatabadi
Chun Peng
E. Hoque
J. Huang
LM&MA
65
0
0
01 Jun 2025
ACCESS DENIED INC: The First Benchmark Environment for Sensitivity Awareness
Dren Fazlija
Arkadij Orlov
Sandipan Sikdar
37
0
0
01 Jun 2025
Deontological Keyword Bias: The Impact of Modal Expressions on Normative Judgments of Language Models
Bumjin Park
Jinsil Lee
Jaesik Choi
20
0
0
01 Jun 2025
CC-Tuning: A Cross-Lingual Connection Mechanism for Improving Joint Multilingual Supervised Fine-Tuning
Yangfan Ye
Xiaocheng Feng
Zekun Yuan
Xiachong Feng
L. Qin
...
Yunfei Lu
Xiaohui Yan
Duyu Tang
Dandan Tu
Bing Qin
45
0
0
01 Jun 2025
Conformal Arbitrage: Risk-Controlled Balancing of Competing Objectives in Language Models
William Overman
Mohsen Bayati
37
0
0
01 Jun 2025