ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

arXiv:2203.02155
Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,370 papers shown
ASMR: Augmenting Life Scenario using Large Generative Models for Robotic Action Reflection
Shang-Chi Tsai
Seiya Kawano
Angel García Contreras
Koichiro Yoshino
Yun-Nung Chen
16 Jun 2025
Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs
Sayed Mohammad Vakilzadeh Hatefi
Maximilian Dreyer
Reduan Achtibat
Patrick Kahardipraja
Thomas Wiegand
Wojciech Samek
Sebastian Lapuschkin
16 Jun 2025
Rethinking Hate Speech Detection on Social Media: Can LLMs Replace Traditional Models?
Daman Deep Singh
Ramanuj Bhattacharjee
Abhijnan Chakraborty
15 Jun 2025
Rethinking DPO: The Role of Rejected Responses in Preference Misalignment
Jay Hyeon Cho
JunHyeok Oh
Myunsoo Kim
Byung-Jun Lee
15 Jun 2025
Jailbreak Strength and Model Similarity Predict Transferability
Rico Angell
Jannik Brinkmann
He He
15 Jun 2025
Large Language Models Enhanced by Plug and Play Syntactic Knowledge for Aspect-based Sentiment Analysis
Yuanhe Tian
Xu Li
Wei Wang
Guoqing Jin
Pengsen Cheng
Yan Song
15 Jun 2025
Identifying and Investigating Global News Coverage of Critical Events Such as Disasters and Terrorist Attacks
Erica Cai
Xi Chen
Reagan Grey Keeney
Ethan Zuckerman
Brendan O'Connor
Przemyslaw A. Grabowicz
15 Jun 2025
Bridging the Digital Divide: Small Language Models as a Pathway for Physics and Photonics Education in Underdeveloped Regions
Asghar Ghorbani
Hanieh Fattahi
14 Jun 2025
Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking Methodology of Human Experts
Zain Muhammad Mujahid
Dilshod Azizov
Maha Tufail Agro
Preslav Nakov
14 Jun 2025
Advances in LLMs with Focus on Reasoning, Adaptability, Efficiency and Ethics
Asifullah Khan
Muhammad Zaeem Khan
Saleha Jamshed
Sadia Ahmad
Aleesha Zainab
Kaynat Khatib
Faria Bibi
Abdul Rehman
14 Jun 2025
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following
Yinghao Ma
Siyou Li
Juntao Yu
Emmanouil Benetos
Akira Maezawa
14 Jun 2025
Pushing the Limits of Safety: A Technical Report on the ATLAS Challenge 2025
Zonghao Ying
Siyang Wu
Run Hao
Peng Ying
Shixuan Sun
...
Xianglong Liu
Dawn Song
Alan Yuille
Philip Torr
Dacheng Tao
14 Jun 2025
Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning
Sara Rajaram
R. J. Cotton
Fabian H. Sinz
14 Jun 2025
Theoretical Tensions in RLHF: Reconciling Empirical Success with Inconsistencies in Social Choice Theory
Jiancong Xiao
Zhekun Shi
Kaizhao Liu
Q. Long
Weijie J. Su
14 Jun 2025
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search
Zhenyu Hou
Ziniu Hu
Yujiang Li
Rui Lu
Jie Tang
Yuxiao Dong
13 Jun 2025
Fed-HeLLo: Efficient Federated Foundation Model Fine-Tuning with Heterogeneous LoRA Allocation
Zikai Zhang
Ping Liu
Jiahao Xu
Rui Hu
13 Jun 2025
InfoFlood: Jailbreaking Large Language Models with Information Overload
Advait Yadav
Haibo Jin
Man Luo
Jun Zhuang
Haohan Wang
13 Jun 2025
The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs
Avinash Baidya
Kamalika Das
Xiang Gao
13 Jun 2025
Eliciting Reasoning in Language Models with Cognitive Tools
Brown Ebouky
Andrea Bartezzaghi
Mattia Rigotti
13 Jun 2025
Personalized LLM Decoding via Contrasting Personal Preference
Hyungjune Bu
Chanjoo Jung
Minjae Kang
Jaehyung Kim
13 Jun 2025
Diffusion-Based Electrocardiography Noise Quantification via Anomaly Detection
Tae-Seong Han
Jae-Wook Heo
Hakseung Kim
Cheol-Hui Lee
Hyub Huh
Eue-Keun Choi
Dong-Joo Kim
13 Jun 2025
Mind the XAI Gap: A Human-Centered LLM Framework for Democratizing Explainable AI
Eva Paraschou
Ioannis Arapakis
Sofia Yfantidou
Sebastian Macaluso
Athena Vakali
13 Jun 2025
EQA-RM: A Generative Embodied Reward Model with Test-time Scaling
Yuhang Chen
Zhen Tan
Tianlong Chen
12 Jun 2025
Table-Text Alignment: Explaining Claim Verification Against Tables in Scientific Papers
Xanh Ho
Sunisth Kumar
Yun-Ang Wu
Florian Boudin
Atsuhiro Takasu
Akiko Aizawa
12 Jun 2025
Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs
Yucong Luo
Yitong Zhou
Mingyue Cheng
Jiahao Wang
Daoyu Wang
Tingyue Pan
Jintao Zhang
12 Jun 2025
Self-Adapting Language Models
Adam Zweiger
Jyothish Pari
Han Guo
Ekin Akyürek
Yoon Kim
Pulkit Agrawal
12 Jun 2025
LLM-as-a-Fuzzy-Judge: Fine-Tuning Large Language Models as a Clinical Evaluation Judge with Fuzzy Logic
Weibing Zheng
Laurah Turner
Jess Kropczynski
Murat Ozer
Tri Nguyen
S. Halse
12 Jun 2025
Discovering Hierarchical Latent Capabilities of Language Models via Causal Representation Learning
Jikai Jin
Vasilis Syrgkanis
Sham Kakade
Hanlin Zhang
12 Jun 2025
GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models
Evelyn Ma
Duo Zhou
Peizhi Niu
Huiting Zhou
Huan Zhang
Olgica Milenković
S. Rasoul Etesami
12 Jun 2025
The Alignment Trap: Complexity Barriers
Jasper Yao
12 Jun 2025
Spurious Rewards: Rethinking Training Signals in RLVR
Rulin Shao
Shuyue Stella Li
Rui Xin
Scott Geng
Yiping Wang
...
Ranjay Krishna
Yulia Tsvetkov
Hannaneh Hajishirzi
Pang Wei Koh
Luke Zettlemoyer
12 Jun 2025
OPT-BENCH: Evaluating LLM Agent on Large-Scale Search Spaces Optimization Problems
Xiaozhe Li
Jixuan Chen
Xinyu Fang
Shengyuan Ding
Haodong Duan
Qingwen Liu
Kai-xiang Chen
12 Jun 2025
Collaborative Prediction: To Join or To Disjoin Datasets
Kyung Rok Kim
Yansong Wang
Xiaocheng Li
Guanting Chen
12 Jun 2025
DreamCS: Geometry-Aware Text-to-3D Generation with Unpaired 3D Reward Supervision
Xiandong Zou
Ruihao Xia
Hongsong Wang
Pan Zhou
11 Jun 2025
VerIF: Verification Engineering for Reinforcement Learning in Instruction Following
Hao Peng
Yunjia Qi
Xiaozhi Wang
Bin Xu
Lei Hou
Juanzi Li
11 Jun 2025
AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation
Zijie Wu
Chaohui Yu
Fan Wang
Xiang Bai
11 Jun 2025
Vision Generalist Model: A Survey
Ziyi Wang
Yongming Rao
Shuofeng Sun
Xinrun Liu
Yi Wei
...
Zuyan Liu
Yanbo Wang
Hongmin Liu
Jie Zhou
Jiwen Lu
11 Jun 2025
Application-Driven Value Alignment in Agentic AI Systems: Survey and Perspectives
Wei Zeng
Hengshu Zhu
Chuan Qin
Han Wu
Yihang Cheng
...
Xiaowei Jin
Yinuo Shen
Zhenxing Wang
Feimin Zhong
Hui Xiong
11 Jun 2025
TooBadRL: Trigger Optimization to Boost Effectiveness of Backdoor Attacks on Deep Reinforcement Learning
Songze Li
Mingxuan Zhang
Kang Wei
Shouling Ji
11 Jun 2025
On a few pitfalls in KL divergence gradient estimation for RL
Yunhao Tang
Rémi Munos
11 Jun 2025
RePO: Replay-Enhanced Policy Optimization
Siheng Li
Zhanhui Zhou
W. Lam
Chao Yang
Chaochao Lu
11 Jun 2025
LEO-VL: Towards 3D Vision-Language Generalists via Data Scaling with Efficient Representation
J. Huang
Xiaojian Ma
Xiongkun Linghu
Yue Fan
Junchao He
...
Qing Li
Song-Chun Zhu
Yixin Chen
Baoxiong Jia
Siyuan Huang
11 Jun 2025
EnerBridge-DPO: Energy-Guided Protein Inverse Folding with Markov Bridges and Direct Preference Optimization
Dingyi Rong
Haotian Lu
Wenzhuo Zheng
Fan Zhang
Shuangjia Zheng
Ning Liu
11 Jun 2025
Athena: Enhancing Multimodal Reasoning with Data-efficient Process Reward Models
Shuai Wang
Zhenhua Liu
Jiaheng Wei
Xuanwu Yin
Dong Li
E. Barsoum
11 Jun 2025
Efficient Preference-Based Reinforcement Learning: Randomized Exploration Meets Experimental Design
Andreas Schlaginhaufen
Reda Ouhamma
Maryam Kamgarpour
11 Jun 2025
When Meaning Stays the Same, but Models Drift: Evaluating Quality of Service under Token-Level Behavioral Instability in LLMs
Xiao Li
Joel Kreuzwieser
Alan Peters
11 Jun 2025
ThinkQE: Query Expansion via an Evolving Thinking Process
Yibin Lei
Tao Shen
Andrew Yates
10 Jun 2025
GFRIEND: Generative Few-shot Reward Inference through EfficieNt DPO
Yiyang Zhao
Huiyu Bai
Xuejiao Zhao
10 Jun 2025
AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin
Shuo Yang
Qihui Zhang
Yuyang Liu
Yue Huang
Xiaojun Jia
...
Jiayu Yao
Jigang Wang
Hailiang Dai
Yibing Song
Li Yuan
10 Jun 2025
Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling
Phuc Minh Nguyen
Ngoc-Hieu Nguyen
Duy Nguyen
Anji Liu
An Mai
Binh T. Nguyen
Daniel Sonntag
Khoa D. Doan
10 Jun 2025