Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2203.02155
Cited By
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Training language models to follow instructions with human feedback"
50 / 4,613 papers shown
Title
Joint Action Language Modelling for Transparent Policy Execution
Theodor Wulff
R. S. Maharjan
Xinyun Chi
Angelo Cangelosi
29
0
0
14 Apr 2025
CHARM: Calibrating Reward Models With Chatbot Arena Scores
Xiao Zhu
Chenmien Tan
Pinzhen Chen
Rico Sennrich
Yanlin Zhang
Hanxu Hu
ALM
30
0
0
14 Apr 2025
RealSafe-R1: Safety-Aligned DeepSeek-R1 without Compromising Reasoning Capability
Y. Zhang
Zihao Zeng
Dongbai Li
Yao Huang
Zhijie Deng
Yinpeng Dong
LRM
38
4
0
14 Apr 2025
Training Small Reasoning LLMs with Cognitive Preference Alignment
Wenrui Cai
Chengyu Wang
Junbing Yan
Jun Huang
Xiangzhong Fang
LRM
26
1
0
14 Apr 2025
Augmented Relevance Datasets with Fine-Tuned Small LLMs
Quentin Fitte-Rey
Matyas Amrouche
Romain Deveaud
39
0
0
14 Apr 2025
Refining Financial Consumer Complaints through Multi-Scale Model Interaction
Bo-Wei Chen
An-Zi Yen
Chung-Chi Chen
AILaw
51
0
0
14 Apr 2025
InstructEngine: Instruction-driven Text-to-Image Alignment
Xingyu Lu
Yihan Hu
Yang Zhang
Kaiyu Jiang
Changyi Liu
...
Bin Wen
C. Yuan
Fan Yang
Tingting Gao
Di Zhang
48
0
0
14 Apr 2025
LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline
Biao Fu
Minpeng Liao
Kai Fan
Chengxi Li
Li Zhang
Yidong Chen
Xiaodong Shi
OffRL
141
1
0
13 Apr 2025
DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training
Zhenting Wang
Guofeng Cui
Kun Wan
Wentian Zhao
35
1
0
13 Apr 2025
QM-ToT: A Medical Tree of Thoughts Reasoning Framework for Quantized Model
Zongxian Yang
Jiayu Qian
Z. Huang
Kay Chen Tan
LM&MA
LRM
31
0
0
13 Apr 2025
AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender
Weixiang Zhao
Jiahe Guo
Yulin Hu
Yang Deng
An Zhang
...
Xinyang Han
Yanyan Zhao
Bing Qin
Tat-Seng Chua
Ting Liu
AAML
LLMSV
43
0
0
13 Apr 2025
Alleviating the Fear of Losing Alignment in LLM Fine-tuning
Kang Yang
Guanhong Tao
X. Chen
Jun Xu
36
0
0
13 Apr 2025
Enhancing Mathematical Reasoning in Large Language Models with Self-Consistency-Based Hallucination Detection
MingShan Liu
Shi Bo
Jialing Fang
LRM
30
0
0
13 Apr 2025
Kongzi: A Historical Large Language Model with Fact Enhancement
Jiashu Yang
Ningning Wang
Yian Zhao
Chaoran Feng
Junjia Du
Hao Pang
Zhirui Fang
Xuxin Cheng
HILM
ALM
LRM
41
0
0
13 Apr 2025
SaRO: Enhancing LLM Safety through Reasoning-based Alignment
Yutao Mou
Yuxiao Luo
Shikun Zhang
Wei Ye
LLMSV
LRM
36
0
0
13 Apr 2025
Computer-Aided Layout Generation for Building Design: A Review
Jiachen Liu
Yuan Xue
Haomiao Ni
Rui Yu
Zihan Zhou
S. X. Huang
3DV
AI4CE
79
0
0
13 Apr 2025
Large Language Models as Particle Swarm Optimizers
Yamato Shinohara
Jinglue Xu
Tianshui Li
Hitoshi Iba
31
0
0
12 Apr 2025
PathVLM-R1: A Reinforcement Learning-Driven Reasoning Model for Pathology Visual-Language Tasks
Jian Wu
Hao Yang
Xinhua Zeng
Guibing He
Zhengzhang Chen
Zhu Li
Xinming Zhang
Yangyang Ma
Run Fang
Yang Liu
LRM
148
0
0
12 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
45
2
0
12 Apr 2025
Continuum-Interaction-Driven Intelligence: Human-Aligned Neural Architecture via Crystallized Reasoning and Fluid Generation
Pengcheng Zhou
Zhiqiang Nie
Haochen Li
53
0
0
12 Apr 2025
A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions
Chengyu Wang
Taolin Zhang
Richang Hong
Jun Huang
ReLM
LRM
45
1
0
12 Apr 2025
Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training
Mingyu Liang
Hiwot Tadese Kassa
Wenyin Fu
Brian Coutinho
Louis Feng
Christina Delimitrou
28
0
0
12 Apr 2025
Feature-Aware Malicious Output Detection and Mitigation
Weilong Dong
Peiguang Li
Yu Tian
Xinyi Zeng
Fengdi Li
Sirui Wang
AAML
24
0
0
12 Apr 2025
Playpen: An Environment for Exploring Learning Through Conversational Interaction
Nicola Horst
Davide Mazzaccara
Antonia Schmidt
Michael Sullivan
Filippo Momentè
...
Alexander Koller
Oliver Lemon
David Schlangen
Mario Giulianelli
Alessandro Suglia
OffRL
34
0
0
11 Apr 2025
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
52
2
0
11 Apr 2025
Large Language Models Could Be Rote Learners
Yuyang Xu
Renjun Hu
Haochao Ying
Jian Wu
Xing Shi
Wei Lin
ELM
175
0
0
11 Apr 2025
X-Guard: Multilingual Guard Agent for Content Moderation
Bibek Upadhayay
Vahid Behzadan
Ph.D
31
1
0
11 Apr 2025
2D-Curri-DPO: Two-Dimensional Curriculum Learning for Direct Preference Optimization
Mengyang Li
Zhong Zhang
27
0
0
10 Apr 2025
NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark
Vladislav Mikhailov
Tita Ranveig Enstad
David Samuel
Hans Christian Farsethås
Andrey Kutuzov
Erik Velldal
Lilja Øvrelid
ELM
45
0
0
10 Apr 2025
Perception-R1: Pioneering Perception Policy with Reinforcement Learning
En Yu
Kangheng Lin
Liang Zhao
Jisheng Yin
Yana Wei
...
Zheng Ge
Xiangyu Zhang
Daxin Jiang
Jingyu Wang
Wenbing Tao
VLM
OffRL
LRM
40
3
0
10 Apr 2025
Enhancing Player Enjoyment with a Two-Tier DRL and LLM-Based Agent System for Fighting Games
Shouren Wang
Zehua Jiang
Fernando Sliva
Sam Earle
Julian Togelius
28
0
0
10 Apr 2025
Plan-and-Refine: Diverse and Comprehensive Retrieval-Augmented Generation
Alireza Salemi
Chris Samarinas
Hamed Zamani
36
0
0
10 Apr 2025
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation
Juzheng Zhang
Jiacheng You
Ashwinee Panda
Tom Goldstein
MoMe
53
1
0
10 Apr 2025
AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation
Tuhin Chakrabarty
Philippe Laban
C. Wu
34
1
0
10 Apr 2025
Integrating Cognitive Processing Signals into Language Models: A Review of Advances, Applications and Future Directions
Angela Lopez-Cardona
Sebastian Idesis
Ioannis Arapakis
31
0
0
09 Apr 2025
Toward Holistic Evaluation of Recommender Systems Powered by Generative Models
Yashar Deldjoo
Nikhil Mehta
M. Sathiamoorthy
Shuai Zhang
Pablo Castells
Julian McAuley
EGVM
ELM
72
1
0
09 Apr 2025
EDIT: Enhancing Vision Transformers by Mitigating Attention Sink through an Encoder-Decoder Architecture
Wenfeng Feng
Guoying Sun
31
0
0
09 Apr 2025
SIGMAN:Scaling 3D Human Gaussian Generation with Millions of Assets
Yuhang Yang
Fengqi Liu
Yixing Lu
Qin Zhao
Pingyu Wu
...
Ran Yi
Yang Cao
Lizhuang Ma
Zheng-jun Zha
Junting Dong
3DGS
47
0
0
09 Apr 2025
Bridging the Gap Between Preference Alignment and Machine Unlearning
Xiaohua Feng
Yuyuan Li
Huwei Ji
Jiaming Zhang
L. Zhang
Tianyu Du
Chaochao Chen
MU
43
0
0
09 Apr 2025
Perception in Reflection
Yana Wei
Liang Zhao
Kangheng Lin
En Yu
Yuang Peng
...
Jianjian Sun
Haoran Wei
Zheng Ge
Xiangyu Zhang
Vishal M. Patel
31
0
0
09 Apr 2025
Societal Impacts Research Requires Benchmarks for Creative Composition Tasks
Judy Hanwen Shen
Carlos Guestrin
36
0
0
09 Apr 2025
PinRec: Outcome-Conditioned, Multi-Token Generative Retrieval for Industry-Scale Recommendation Systems
Anirudhan Badrinath
Prabhat Agarwal
Laksh Bhasin
Jaewon Yang
Jiajing Xu
Charles R. Rosenberg
LRM
45
0
0
09 Apr 2025
AssistanceZero: Scalably Solving Assistance Games
Cassidy Laidlaw
Eli Bronstein
Timothy Guo
Dylan Feng
Lukas Berglund
Justin Svegliato
Stuart J. Russell
Anca Dragan
37
1
0
09 Apr 2025
CAReDiO: Cultural Alignment of LLM via Representativeness and Distinctiveness Guided Data Optimization
Jing Yao
Xiaoyuan Yi
Jindong Wang
Zhicheng Dou
Xing Xie
31
0
0
09 Apr 2025
Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models
Wei Chen
Xin Yan
Bin Wen
Fan Yang
Tingting Gao
Di Zhang
Long Chen
MLLM
97
0
0
09 Apr 2025
Reasoning Towards Fairness: Mitigating Bias in Language Models through Reasoning-Guided Fine-Tuning
Sanchit Kabra
Akshita Jha
Chandan K. Reddy
LRM
28
0
0
08 Apr 2025
Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi
Monojit Choudhury
Shivam Chauhan
Rocktim Jyoti Das
Dhruv Sahnan
Xudong Han
...
Rituraj Joshi
Gurpreet Gosal
Avraham Sheinin
Natalia Vassilieva
Preslav Nakov
33
0
0
08 Apr 2025
Information-Theoretic Reward Decomposition for Generalizable RLHF
Liyuan Mao
Haoran Xu
Amy Zhang
Weinan Zhang
Chenjia Bai
35
0
0
08 Apr 2025
Separator Injection Attack: Uncovering Dialogue Biases in Large Language Models Caused by Role Separators
Xitao Li
Haoran Wang
Jiang Wu
Ting Liu
AAML
26
0
0
08 Apr 2025
FactGuard: Leveraging Multi-Agent Systems to Generate Answerable and Unanswerable Questions for Enhanced Long-Context LLM Extraction
Qian Zhang
Fang Li
Jie Wang
Lingfeng Qiao
Yifei Yu
Di Yin
Xingchen Sun
RALM
65
0
0
08 Apr 2025
Previous
1
2
3
...
5
6
7
...
91
92
93
Next