Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM
arXiv: 2203.02155 (abs · PDF · HTML)

Papers citing "Training language models to follow instructions with human feedback"

Showing 50 of 6,374 citing papers
ArgInstruct: Specialized Instruction Fine-Tuning for Computational Argumentation
Maja Stahl
Timon Ziegenbein
Joonsuk Park
Henning Wachsmuth
ALM, LRM
36 · 0 · 0
28 May 2025
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
Ganqu Cui
Yuchen Zhang
Jiacheng Chen
Lifan Yuan
Zhi Wang
...
Lei Bai
Wanli Ouyang
Yu Cheng
Bowen Zhou
Ning Ding
LRM
88 · 5 · 0
28 May 2025
Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition
Hanting Chen
Yasheng Wang
Kai Han
Dong Li
Lin Li
...
Hailin Hu
Yehui Tang
Dacheng Tao
Xinghao Chen
Yunhe Wang
LRM
98 · 0 · 0
28 May 2025
Beyond path selection: Better LLMs for Scientific Information Extraction with MimicSFT and Relevance and Rule-induced (R$^2$) GRPO
Ran Li
Shimin Di
Yuchen Liu
Chen Jing
Yu Qiu
Lei Chen
LRM
79 · 0 · 0
28 May 2025
Operationalizing CaMeL: Strengthening LLM Defenses for Enterprise Deployment
Krti Tallam
Emma Miller
42 · 0 · 0
28 May 2025
MIRROR: Multi-agent Intra- and Inter-Reflection for Optimized Reasoning in Tool Learning
Zikang Guo
Benfeng Xu
Xiaorui Wang
Zhendong Mao
83 · 0 · 0
27 May 2025
SeqPO-SiMT: Sequential Policy Optimization for Simultaneous Machine Translation
Ting Xu
Zhichao Huang
Jiankai Sun
Shanbo Cheng
Wai Lam
OffRL
29 · 0 · 0
27 May 2025
SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge
Fengqing Jiang
Fengbo Ma
Zhangchen Xu
Yuetai Li
Bhaskar Ramasubramanian
Luyao Niu
Bo Li
Xianyan Chen
Zhen Xiang
Radha Poovendran
ALM, ELM
76 · 1 · 0
27 May 2025
Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
Muzhi Zhu
Hao Zhong
Canyu Zhao
Zongze Du
Zheng Huang
...
Hao Chen
Cheng Zou
Jingdong Chen
Ming-Hsuan Yang
Chunhua Shen
LRM
174 · 0 · 0
27 May 2025
Contrastive Learning on LLM Back Generation Treebank for Cross-domain Constituency Parsing
Peiming Guo
Meishan Zhang
Jianling Li
Min Zhang
Yue Zhang
36 · 0 · 0
27 May 2025
Towards Better Instruction Following Retrieval Models
Yuchen Zhuang
Aaron Trinh
Rushi Qiang
Haotian Sun
Chao Zhang
Hanjun Dai
Bo Dai
150 · 1 · 0
27 May 2025
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
Xiaojun Jia
Sensen Gao
Simeng Qin
Tianyu Pang
C. Du
Yihao Huang
Xinfeng Li
Yiming Li
Bo Li
Yang Liu
AAML
48 · 0 · 0
27 May 2025
Position is Power: System Prompts as a Mechanism of Bias in Large Language Models (LLMs)
Anna Neumann
Elisabeth Kirsten
Muhammad Bilal Zafar
Jatinder Singh
62 · 0 · 0
27 May 2025
R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning
Yongchao Chen
Y. Liu
Junwei Zhou
Yilun Hao
Jingquan Wang
Yang Zhang
Chuchu Fan
OffRL, ReLM, AI4TS, SyDa, ALM, LRM
79 · 0 · 0
27 May 2025
Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation
Tharindu Kumarage
Ninareh Mehrabi
Anil Ramakrishna
Xinyan Zhao
R. Zemel
Kai-Wei Chang
Aram Galstyan
Rahul Gupta
Charith Peris
LRM
36 · 0 · 0
27 May 2025
Relevance-driven Input Dropout: an Explanation-guided Regularization Technique
Shreyas Gururaj
Lars Grüne
Wojciech Samek
Sebastian Lapuschkin
Leander Weber
142 · 0 · 0
27 May 2025
Square$\chi$PO: Differentially Private and Robust $\chi^2$-Preference Optimization in Offline Direct Alignment
Xingyu Zhou
Yulian Wu
Wenqian Weng
Francesco Orabona
83 · 0 · 0
27 May 2025
Fundamental Limits of Game-Theoretic LLM Alignment: Smith Consistency and Preference Matching
Zhekun Shi
Kaizhao Liu
Qi Long
Weijie J. Su
Jiancong Xiao
58 · 2 · 0
27 May 2025
Reinforcing General Reasoning without Verifiers
Xiangxin Zhou
Zichen Liu
Anya Sims
Haonan Wang
Tianyu Pang
Chongxuan Li
Liang Wang
Min Lin
C. Du
OffRL, LRM
78 · 2 · 0
27 May 2025
Personalized Query Auto-Completion for Long and Short-Term Interests with Adaptive Detoxification Generation
Zhibo Wang
Xiaoze Jiang
Zhiheng Qin
Enyun Yu
Han Li
51 · 1 · 0
27 May 2025
RRO: LLM Agent Optimization Through Rising Reward Trajectories
Zilong Wang
Jingfeng Yang
Sreyashi Nag
Samarth Varshney
Xianfeng Tang
Haoming Jiang
Jingbo Shang
Sheikh Sarwar
LRM
48 · 0 · 0
27 May 2025
STEER-BENCH: A Benchmark for Evaluating the Steerability of Large Language Models
Kai Chen
Zihao He
Taiwei Shi
Kristina Lerman
ALM, LLMSV
104 · 0 · 0
27 May 2025
Explainability of Large Language Models using SMILE: Statistical Model-agnostic Interpretability with Local Explanations
Zeinab Dehghani
Koorosh Aslansefat
Adil Khan
Mohammed Naveed Akram
MILM, LRM
136 · 0 · 0
27 May 2025
Breaking the Performance Ceiling in Complex Reinforcement Learning requires Inference Strategies
Félix Chalumeau
Daniel Rajaonarivonivelomanantsoa
Ruan de Kock
Claude Formanek
Sasha Abramowitz
...
Refiloe Shabe
Arnol Fokam
Siddarth S. Singh
Ulrich A. Mbou Sob
Arnu Pretorius
67 · 0 · 0
27 May 2025
MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding
Fuwen Luo
Shengfeng Lou
C. L. Philip Chen
Ziyue Wang
Chenliang Li
...
Peng Li
Ming Yan
Ji Zhang
Fei Huang
Yang Liu
AI4TS, LRM
81 · 0 · 0
27 May 2025
Pretrained LLMs Learn Multiple Types of Uncertainty
Roi Cohen
Omri Fahn
Gerard de Melo
43 · 0 · 0
27 May 2025
Concealment of Intent: A Game-Theoretic Analysis
Xinbo Wu
A. Umrawal
Lav Varshney
33 · 0 · 0
27 May 2025
DreamBoothDPO: Improving Personalized Generation using Direct Preference Optimization
Shamil Ayupov
M. Nakhodnov
Anastasia Yaschenko
Andrey Kuznetsov
Aibek Alanov
52 · 0 · 0
27 May 2025
Self-Reflective Planning with Knowledge Graphs: Enhancing LLM Reasoning Reliability for Question Answering
J. Zhu
Ye Liu
Meikai Bao
Kai Zhang
Yanghai Zhang
Qi Liu
LRM
49 · 0 · 0
26 May 2025
EGA-V1: Unifying Online Advertising with End-to-End Learning
Junyan Qiu
Ze Wang
Fan Zhang
Zuowu Zheng
Jile Zhu
Jiangke Fan
Teng Zhang
Haitao Wang
Yongkang Wang
Xingxing Wang
OffRL
71 · 0 · 0
26 May 2025
Conversation Kernels: A Flexible Mechanism to Learn Relevant Context for Online Conversation Understanding
Vibhor Agarwal
Arjoo Gupta
Suparna De
Nishanth R. Sastry
46 · 0 · 0
26 May 2025
Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO
Ruizhe Shi
Minhak Song
Runlong Zhou
Zihan Zhang
Maryam Fazel
S. S. Du
76 · 0 · 0
26 May 2025
What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs
Sangyeop Kim
Yohan Lee
Yongwoo Song
Kimin Lee
AAML
34 · 0 · 0
26 May 2025
Does Rationale Quality Matter? Enhancing Mental Disorder Detection via Selective Reasoning Distillation
Hoyun Song
Huije Lee
Jisu Shin
Sukmin Cho
Changgeon Ko
Jong C. Park
AI4MH, LRM
86 · 1 · 0
26 May 2025
Learning to Reason without External Rewards
Xuandong Zhao
Zhewei Kang
Aosong Feng
Sergey Levine
Dawn Song
OffRL, ReLM, LRM
135 · 8 · 0
26 May 2025
On the Same Page: Dimensions of Perceived Shared Understanding in Human-AI Interaction
Qingyu Liang
Jaime Banks
27 · 0 · 0
26 May 2025
GenKI: Enhancing Open-Domain Question Answering with Knowledge Integration and Controllable Generation in Large Language Models
Tingjia Shen
Hao Wang
Chuan Qin
Ruijun Sun
Yang Song
Defu Lian
Hengshu Zhu
Enhong Chen
57 · 0 · 0
26 May 2025
Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits
Fan Chen
Zeyu Jia
Alexander Rakhlin
Tengyang Xie
OffRL
29 · 0 · 0
26 May 2025
Estimating LLM Consistency: A User Baseline vs Surrogate Metrics
Xiaoyuan Wu
Weiran Lin
Omer Akgul
Lujo Bauer
HILM
26 · 0 · 0
26 May 2025
JailBound: Jailbreaking Internal Safety Boundaries of Vision-Language Models
Jiaxin Song
Yixu Wang
Jie Li
Rui Yu
Yan Teng
Xingjun Ma
Yingchun Wang
AAML
70 · 0 · 0
26 May 2025
Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents
Tao Wu
Jingyuan Chen
Wang Lin
Mengze Li
Yumeng Zhu
Ang Li
Kun Kuang
Leilei Gan
LLMAG, AI4Ed
55 · 1 · 0
26 May 2025
Energy-based Preference Optimization for Test-time Adaptation
Yewon Han
Seoyun Yang
Taesup Kim
TTA
286 · 0 · 0
26 May 2025
SCAR: Shapley Credit Assignment for More Efficient RLHF
Meng Cao
Shuyuan Zhang
Xiao-Wen Chang
Doina Precup
119 · 0 · 0
26 May 2025
StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation
Yi Wu
Lingting Zhu
Shengju Qian
Lei Liu
Wandi Qiao
Lequan Yu
Bin Li
72 · 0 · 0
26 May 2025
Amulet: Putting Complex Multi-Turn Conversations on the Stand with LLM Juries
Sahana Ramnath
Anurag Mudgil
Brihi Joshi
Skyler Hallinan
Xiang Ren
52 · 0 · 0
26 May 2025
ARM: Adaptive Reasoning Model
Siye Wu
Jian Xie
Yikai Zhang
Aili Chen
Kai Zhang
Yu Su
Yanghua Xiao
LRM
82 · 0 · 0
26 May 2025
Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback
Mengdi Li
Jiaye Lin
Xufeng Zhao
Wenhao Lu
P. Zhao
S. Wermter
Di Wang
50 · 0 · 0
26 May 2025
Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models
Yi Liu
Dianqing Liu
Mingye Zhu
Junbo Guo
Yongdong Zhang
Zhendong Mao
102 · 0 · 0
26 May 2025
Token-Importance Guided Direct Preference Optimization
Yang Ning
Lin Hai
Liu Yibo
Tian Baoliang
Liu Guoqing
Zhang Haijun
71 · 0 · 0
26 May 2025
FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement
Bingguang Hao
Maolin Wang
Zengzhuang Xu
Cunyin Peng
Yicheng Chen
Xiangyu Zhao
Jinjie Gu
Chenyi Zhuang
ReLM, LRM
111 · 0 · 0
26 May 2025