Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,380 papers shown
Learning to Reason without External Rewards
Xuandong Zhao
Zhewei Kang
Aosong Feng
Sergey Levine
Dawn Song
OffRL, ReLM, LRM
135
8
0
26 May 2025
S2LPP: Small-to-Large Prompt Prediction across LLMs
Liang Cheng
Tianyi Li
Zhaowei Wang
Mark Steedman
LRM
26
0
0
26 May 2025
MultiPhishGuard: An LLM-based Multi-Agent System for Phishing Email Detection
Yinuo Xue
Eric Spero
Yun Sing Koh
Giovanni Russello
AAML
30
1
0
26 May 2025
CPA-RAG: Covert Poisoning Attacks on Retrieval-Augmented Generation in Large Language Models
Chunyang Li
Junwei Zhang
Anda Cheng
Zhuo Ma
Xinghua Li
Jianfeng Ma
SILM, AAML
42
0
0
26 May 2025
Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models
Makesh Narsimhan Sreedhar
Traian Rebedea
Christopher Parisien
LRM
97
0
0
26 May 2025
Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models
Yi Liu
Dianqing Liu
Mingye Zhu
Junbo Guo
Yongdong Zhang
Zhendong Mao
102
0
0
26 May 2025
Proxy-Free GFlowNet
Ruishuo Chen
Xun Wang
Rui Hu
Zhuoran Li
Longbo Huang
74
0
0
26 May 2025
What Can RL Bring to VLA Generalization? An Empirical Study
Jijia Liu
Feng Gao
Bingwen Wei
Xinlei Chen
Qingmin Liao
Yi Wu
Chao Yu
Yu Wang
OffRL
302
0
0
26 May 2025
SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety
Geon-hyeong Kim
Youngsoo Jang
Yu Jin Kim
Byoungjip Kim
Honglak Lee
Kyunghoon Bae
Moontae Lee
28
2
0
26 May 2025
Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits
Fan Chen
Zeyu Jia
Alexander Rakhlin
Tengyang Xie
OffRL
31
0
0
26 May 2025
Interleaved Reasoning for Large Language Models via Reinforcement Learning
Roy Xie
David Qiu
Deepak Gopinath
Dong Lin
Yanchao Sun
Chong-Jun Wang
Saloni Potdar
Bhuwan Dhingra
KELM, LRM
75
0
0
26 May 2025
Learning to Select In-Context Demonstration Preferred by Large Language Model
Zheng Zhang
Shaocheng Lan
Lei Song
Jiang Bian
Yexin Li
Kan Ren
29
0
0
26 May 2025
Estimating LLM Consistency: A User Baseline vs Surrogate Metrics
Xiaoyuan Wu
Weiran Lin
Omer Akgul
Lujo Bauer
HILM
26
0
0
26 May 2025
FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement
Bingguang Hao
Maolin Wang
Zengzhuang Xu
Cunyin Peng
Yicheng Chen
Xiangyu Zhao
Jinjie Gu
Chenyi Zhuang
ReLM, LRM
113
0
0
26 May 2025
JailBound: Jailbreaking Internal Safety Boundaries of Vision-Language Models
Jiaxin Song
Yixu Wang
Jie Li
Rui Yu
Yan Teng
Xingjun Ma
Yingchun Wang
AAML
70
0
0
26 May 2025
Learning a Pessimistic Reward Model in RLHF
Yinglun Xu
Hangoo Kang
Tarun Suresh
Yuxuan Wan
Gagandeep Singh
OffRL
66
0
0
26 May 2025
What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs
Sangyeop Kim
Yohan Lee
Yongwoo Song
Kimin Lee
AAML
34
0
0
26 May 2025
Amulet: Putting Complex Multi-Turn Conversations on the Stand with LLM Juries
Sahana Ramnath
Anurag Mudgil
Brihi Joshi
Skyler Hallinan
Xiang Ren
52
0
0
26 May 2025
Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers
Rihui Xin
Han Liu
Zecheng Wang
Yupeng Zhang
Dianbo Sui
Xiaolin Hu
Bingning Wang
SyDa
73
1
0
26 May 2025
SCAR: Shapley Credit Assignment for More Efficient RLHF
Meng Cao
Shuyuan Zhang
Xiao-Wen Chang
Doina Precup
119
0
0
26 May 2025
Improving Value Estimation Critically Enhances Vanilla Policy Gradient
Tao Wang
Ruipeng Zhang
Sicun Gao
OffRL
53
0
0
25 May 2025
A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning
Yuzheng Hu
Fan Wu
Haotian Ye
David A. Forsyth
James Y. Zou
Nan Jiang
Jiaqi W. Ma
Han Zhao
OffRL
79
0
0
25 May 2025
Incentivizing High-Quality Human Annotations with Golden Questions
Shang Liu
Zhongze Cai
Hanzhao Wang
Zhongyao Ma
Xiaocheng Li
82
0
0
25 May 2025
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems
Xuanming Zhang
Yuxuan Chen
Min-Hsuan Yeh
Yixuan Li
LLMAG, AI4CE
64
0
0
25 May 2025
The Price of Format: Diversity Collapse in LLMs
Longfei Yun
Chenyang An
Zilong Wang
Letian Peng
Jingbo Shang
47
0
0
25 May 2025
System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts
Xiaoqiang Wang
Suyuchen Wang
Yun Zhu
Bang Liu
ReLM, LRM
123
0
0
25 May 2025
An Embarrassingly Simple Defense Against LLM Abliteration Attacks
Harethah Shairah
Hasan Hammoud
Bernard Ghanem
G. Turkiyyah
63
0
0
25 May 2025
VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use
Mingyuan Wu
Jingcheng Yang
Jize Jiang
Meitang Li
Kaizhuo Yan
Hanchao Yu
Minjia Zhang
Chengxiang Zhai
Klara Nahrstedt
LRM
173
0
0
25 May 2025
Do Large Language Models (Really) Need Statistical Foundations?
Weijie Su
274
0
0
25 May 2025
When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas
Steffen Backmann
David Guzman Piedrahita
Emanuel Tewolde
Rada Mihalcea
Bernhard Schölkopf
Zhijing Jin
90
0
0
25 May 2025
ActiveDPO: Active Direct Preference Optimization for Sample-Efficient Alignment
Xiaoqiang Lin
Arun Verma
Zhongxiang Dai
Daniela Rus
See-Kiong Ng
Bryan Kian Hsiang Low
275
0
0
25 May 2025
LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models
Fengqi Zhu
Rongzhen Wang
Shen Nie
Xiaolu Zhang
Chunwei Wu
...
Jun Zhou
Jianfei Chen
Yankai Lin
Ji-Rong Wen
Chongxuan Li
195
2
0
25 May 2025
Mitigating Deceptive Alignment via Self-Monitoring
Jiaming Ji
Wenqi Chen
Kaile Wang
Donghai Hong
Sitong Fang
...
Jiayi Zhou
Juntao Dai
Sirui Han
Yike Guo
Yaodong Yang
LRM
57
2
0
24 May 2025
Synthesizing and Adapting Error Correction Data for Mobile Large Language Model Applications
Yanxiang Zhang
Zheng Xu
Shanshan Wu
Yuanbo Zhang
Daniel Ramage
KELM
46
0
0
24 May 2025
Unraveling Misinformation Propagation in LLM Reasoning
Yiyang Feng
Yichen Wang
Shaobo Cui
Boi Faltings
Mina Lee
Jiawei Zhou
LRM
90
0
0
24 May 2025
MOSLIM: Align with diverse preferences in prompts through reward classification
Yu Zhang
Wanli Jiang
Zhengyu Yang
25
1
0
24 May 2025
Optimal Transport-Based Token Weighting scheme for Enhanced Preference Optimization
Meng Li
Guangda Huzhang
Haibo Zhang
Xiting Wang
Anxiang Zeng
42
0
0
24 May 2025
The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation
Ruichen Zhang
Rana Muhammad Shahroz Khan
Zhen Tan
Dawei Li
Song Wang
Tianlong Chen
LRM
63
0
0
24 May 2025
AI-Driven Climate Policy Scenario Generation for Sub-Saharan Africa
Rafiu Adekoya Badekale
Adewale Akinfaderin
46
0
0
24 May 2025
VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning
Guanxing Lu
Wenkai Guo
Chubin Zhang
Yuheng Zhou
Haonan Jiang
Zifeng Gao
Yansong Tang
Ziwei Wang
OffRL
118
0
0
24 May 2025
From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation
Zhihao Zhang
Yiran Zhang
Xiyue Zhou
Liting Huang
Imran Razzak
Preslav Nakov
Usman Naseem
24
0
0
24 May 2025
TNG-CLIP: Training-Time Negation Data Generation for Negation Awareness of CLIP
Yuliang Cai
Jesse Thomason
Mohammad Rostami
VLM
29
0
0
24 May 2025
Using Large Language Models to Tackle Fundamental Challenges in Graph Learning: A Comprehensive Survey
Mengran Li
Pengyu Zhang
Wenbin Xing
Yijia Zheng
Klim Zaporojets
...
Jia Hu
Xiaolei Ma
Zhiyuan Liu
Paul Groth
Marcel Worring
AI4CE
151
0
0
24 May 2025
Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs
Mengqi Liao
Xiangyu Xi
Ruinian Chen
Jia Leng
Yangen Hu
Ke Zeng
Shuai Liu
Huaiyu Wan
LRM
53
0
0
24 May 2025
Benchmarking and Rethinking Knowledge Editing for Large Language Models
Guoxiu He
Xin Song
Futing Wang
Aixin Sun
KELM
48
0
0
24 May 2025
GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains
C. Wang
Xiaoran Pan
Zihao Pan
Haofan Wang
Yiren Song
LRM
152
0
0
24 May 2025
Hybrid Latent Reasoning via Reinforcement Learning
Zhenrui Yue
Bowen Jin
Huimin Zeng
Honglei Zhuang
Zhen Qin
Jinsung Yoon
Lanyu Shang
Jiawei Han
Dong Wang
OffRL, BDL, LRM
70
0
0
24 May 2025
PromptWise: Online Learning for Cost-Aware Prompt Assignment in Generative Models
Xiaoyan Hu
Lauren Pick
Ho-fung Leung
Farzan Farnia
40
1
0
24 May 2025
Safety Alignment via Constrained Knowledge Unlearning
Zesheng Shi
Yucheng Zhou
Jing Li
MU, KELM, AAML
84
2
0
24 May 2025
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models
Haoyuan Sun
Jiaqi Wu
Bo Xia
Yifu Luo
Yifei Zhao
Kai Qin
Xufei Lv
Tiantian Zhang
Yongzhe Chang
Xueqian Wang
OffRL, LRM
209
0
0
24 May 2025