Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM
arXiv: 2203.02155

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,370 papers shown
Title
Deontological Keyword Bias: The Impact of Modal Expressions on Normative Judgments of Language Models
Bumjin Park
Jinsil Lee
Jaesik Choi
20
0
0
01 Jun 2025
Existing Large Language Model Unlearning Evaluations Are Inconclusive
Zhili Feng
Yixuan Even Xu
Alexander Robey
Robert Kirk
Xander Davies
Yarin Gal
Avi Schwarzschild
J. Zico Kolter
MU, ELM
35
0
0
31 May 2025
GuideX: Guided Synthetic Data Generation for Zero-Shot Information Extraction
Neil De La Fuente
Oscar Sainz
Iker García-Ferrero
Eneko Agirre
SyDa
44
0
0
31 May 2025
Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs
Yufa Zhou
S. Wang
Xingyu Dong
Xiangqi Jin
Yifang Chen
Yue Min
Kexin Yang
Xingzhang Ren
Dayiheng Liu
Linfeng Zhang
OffRL, LRM
30
0
0
31 May 2025
Preference-based learning for news headline recommendation
Alexandre Bouras
A. Durand
Richard Khoury
15
0
0
31 May 2025
CLARIFY: Contrastive Preference Reinforcement Learning for Untangling Ambiguous Queries
Ni Mu
Hao Hu
Xiao Hu
Yiqin Yang
Bo Xu
Qing-Shan Jia
57
0
0
31 May 2025
Alignment Revisited: Are Large Language Models Consistent in Stated and Revealed Preferences?
Zhuojun Gu
Quan Wang
Shuchu Han
24
0
0
31 May 2025
RLAE: Reinforcement Learning-Assisted Ensemble for LLMs
Y. Fu
Yuanheng Zhu
Jiajun Chai
Guojun Yin
Wei Lin
Qichao Zhang
Dongbin Zhao
25
0
0
31 May 2025
Optimizing Question Semantic Space for Dynamic Retrieval-Augmented Multi-hop Question Answering
Linhao Ye
Lang Yu
Zhikai Lei
Qin Chen
Jie Zhou
Liang He
30
0
0
31 May 2025
Central Path Proximal Policy Optimization
Nikola Milosevic
Johannes Müller
Nico Scherf
27
0
0
31 May 2025
MIRROR: Cognitive Inner Monologue Between Conversational Turns for Persistent Reflection and Reasoning in Conversational LLMs
Nicole Hsing
LRM
36
0
0
31 May 2025
SHARE: An SLM-based Hierarchical Action CorREction Assistant for Text-to-SQL
Ge Qu
Jinyang Li
Bowen Qin
Xiaolong Li
Nan Huo
Chenhao Ma
Reynold Cheng
25
0
0
31 May 2025
MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning
Yiqing Liang
Jielin Qiu
Wenhao Ding
Zuxin Liu
James Tompkin
Mengdi Xu
Mengzhou Xia
Zhengzhong Tu
Laixi Shi
Jiacheng Zhu
OffRL
128
0
0
30 May 2025
Beyond Multiple Choice: Evaluating Steering Vectors for Adaptive Free-Form Summarization
Joschka Braun
Carsten Eickhoff
Seyed Ali Bahrainian
LLMSV
25
0
0
30 May 2025
AMSbench: A Comprehensive Benchmark for Evaluating MLLM Capabilities in AMS Circuits
Yichen Shi
Ze Zhang
Hongyang Wang
Zhuofu Tao
Zhongyi Li
Bingyu Chen
Yaxin Wang
Zhiping Yu
Ting-Jung Lin
Lei He
28
0
0
30 May 2025
On Symmetric Losses for Robust Policy Optimization with Noisy Preferences
Soichiro Nishimori
Yu Zhang
Thanawat Lodkaew
Masashi Sugiyama
NoLa
42
0
0
30 May 2025
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
Wei Fu
Jiaxuan Gao
Xujie Shen
Chen Zhu
Zhiyu Mei
...
Jun Mei
Jiashu Wang
Tongkai Yang
Binhang Yuan
Yi Wu
OffRL, SyDa, LRM
72
0
0
30 May 2025
Reason-SVG: Hybrid Reward RL for Aha-Moments in Vector Graphics Generation
Ximing Xing
Yandong Guan
Jing Zhang
Dong Xu
Qian Yu
LRM
86
0
0
30 May 2025
Adversarial Preference Learning for Robust LLM Alignment
Yuanfu Wang
Pengyu Wang
Chenyang Xi
Bo Tang
Junyi Zhu
...
Keming Mao
Zhiyu Li
Feiyu Xiong
Jie Hu
Mingchuan Yang
AAML
29
0
0
30 May 2025
When Large Multimodal Models Confront Evolving Knowledge: Challenges and Pathways
Kailin Jiang
Yuntao Du
Yukai Ding
Yuchen Ren
Ning Jiang
Zhi Gao
Zilong Zheng
Lei Liu
Bin Li
Qing Li
KELM
51
0
0
30 May 2025
Beyond Linear Steering: Unified Multi-Attribute Control for Language Models
Narmeen Oozeer
Luke Marks
Fazl Barez
Amirali Abdullah
LLMSV
33
0
0
30 May 2025
Whispers of Many Shores: Cultural Alignment through Collaborative Cultural Expertise
Shuai Feng
Wei-Chuang Chan
Srishti Chouhan
Junior Francisco Garcia Ayala
Srujananjali Medicherla
Kyle Clark
Mingwei Shi
27
0
0
30 May 2025
A Red Teaming Roadmap Towards System-Level Safety
Zifan Wang
Christina Q. Knight
Jeremy Kritz
Willow Primack
Julian Michael
AAML
54
0
0
30 May 2025
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control
Zijie Xu
Tong Bu
Zecheng Hao
Jianhao Ding
Zhaofei Yu
30
0
0
30 May 2025
RAST: Reasoning Activation in LLMs via Small-model Transfer
Siru Ouyang
Xinyu Zhu
Zilin Xiao
Minhao Jiang
Yu Meng
Jiawei Han
OffRL, ReLM, LRM
29
0
0
30 May 2025
Intuitionistic Fuzzy Sets for Large Language Model Data Annotation: A Novel Approach to Side-by-Side Preference Labeling
Yimin Du
39
0
0
30 May 2025
A Reward-driven Automated Webshell Malicious-code Generator for Red-teaming
Yizhong Ding
AAML
24
0
0
30 May 2025
Emergent Abilities of Large Language Models under Continued Pretraining for Language Adaptation
Ahmed Elhady
Eneko Agirre
Mikel Artetxe
CLL, KELM, ELM
37
0
0
30 May 2025
Bootstrapping LLM Robustness for VLM Safety via Reducing the Pretraining Modality Gap
Wenhan Yang
Spencer Stice
Ali Payani
Baharan Mirzasoleiman
MLLM
32
0
0
30 May 2025
Writing-Zero: Bridge the Gap Between Non-verifiable Tasks and Verifiable Rewards
Xun Lu
Yunyi Yang
Yongbo Gai
Kai Luo
Shihao Huang
Jianhe Lin
Xiaoxi Jiang
Guanjun Jiang
57
0
0
30 May 2025
MDPO: Multi-Granularity Direct Preference Optimization for Mathematical Reasoning
Yunze Lin
LRM
15
0
0
30 May 2025
Tag-Evol: Achieving Efficient Instruction Evolving via Tag Injection
Yixuan Wang
Shiqi Zhou
Chuanzhe Guo
Qingfu Zhu
3DV
41
0
0
30 May 2025
ClinBench-HPB: A Clinical Benchmark for Evaluating LLMs in Hepato-Pancreato-Biliary Diseases
Y. Li
Xiaojun Zeng
Chihua Fang
Jian Yang
Fucang Jia
L. Zhang
LM&MA, ELM, AI4MH
51
0
0
30 May 2025
Aligning Language Models with Observational Data: Opportunities and Risks from a Causal Perspective
Erfan Loghmani
24
0
0
30 May 2025
The Hallucination Dilemma: Factuality-Aware Reinforcement Learning for Large Reasoning Models
Junyi Li
Hwee Tou Ng
OffRL, HILM, LRM
57
0
0
30 May 2025
Benchmarking Foundation Models for Zero-Shot Biometric Tasks
Redwan Sony
Parisa Farmanifard
Hamzeh Alzwairy
Nitish Shukla
Arun Ross
CVBM, VLM
58
0
0
30 May 2025
Disentangled Safety Adapters Enable Efficient Guardrails and Flexible Inference-Time Alignment
Kundan Krishna
Joseph Y Cheng
Charles Maalouf
Leon A Gatys
25
0
0
30 May 2025
Dataset Cartography for Large Language Model Alignment: Mapping and Diagnosing Preference Data
Seohyeong Lee
Eunwon Kim
Hwaran Lee
Buru Chang
70
0
0
29 May 2025
The End Of Universal Lifelong Identifiers: Identity Systems For The AI Era
Shriphani Palakodety
27
0
0
29 May 2025
Document-Level Text Generation with Minimum Bayes Risk Decoding using Optimal Transport
Yuu Jinnai
OT
50
0
0
29 May 2025
Mis-prompt: Benchmarking Large Language Models for Proactive Error Handling
Jiayi Zeng
Yizhe Feng
Mengliang He
Wenhui Lei
Wei Zhang
Zeming Liu
Xiaoming Shi
Aimin Zhou
LRM
26
0
0
29 May 2025
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models
Chenbin Pan
Wenbin He
Zhengzhong Tu
Liu Ren
LRM, VLM
75
0
0
29 May 2025
SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA
Minrui Luo
Fuhang Kuang
Yu Wang
Zirui Liu
Tianxing He
CLL
62
0
0
29 May 2025
DIP-R1: Deep Inspection and Perception with RL Looking Through and Understanding Complex Scenes
Sungjune Park
Hyunjun Kim
Junho Kim
S. T. Kim
Y. Ro
LRM
123
0
0
29 May 2025
Identity resolution of software metadata using Large Language Models
Eva Martín del Pico
Josep Lluís Gelpí
Salvador Capella-Gutiérrez
32
0
0
29 May 2025
Securing AI Agents with Information-Flow Control
Manuel Costa
Boris Köpf
Aashish Kolluri
Andrew Paverd
M. Russinovich
Ahmed Salem
Shruti Tople
Lukas Wutschitz
Santiago Zanella Béguelin
377
0
0
29 May 2025
SafeCOMM: What about Safety Alignment in Fine-Tuned Telecom Large Language Models?
Aladin Djuhera
S. Kadhe
Farhan Ahmed
Syed Zawad
Holger Boche
Walid Saad
35
0
0
29 May 2025
Stairway to Success: Zero-Shot Floor-Aware Object-Goal Navigation via LLM-Driven Coarse-to-Fine Exploration
Zeying Gong
Rong Li
Tianshuai Hu
Ronghe Qiu
Lingdong Kong
Lingfeng Zhang
Yiyi Ding
Leying Zhang
Junwei Liang
62
0
0
29 May 2025
How Does Response Length Affect Long-Form Factuality
James Xu Zhao
Jimmy Z.J. Liu
Bryan Hooi
See-Kiong Ng
HILM, KELM
74
0
0
29 May 2025
MAP: Revisiting Weight Decomposition for Low-Rank Adaptation
Chongjie Si
Zhiyi Shi
Yadao Wang
Xiaokang Yang
Susanto Rahardja
Wei Shen
64
0
0
29 May 2025