ResearchTrend.AI
Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLM
    ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 7,214 papers shown
Integrating Large Language Models with Human Expertise for Disease Detection in Electronic Health Records
Jie Pan
Seungwon Lee
Cheligeer Cheligeer
Elliot A. Martin
Kiarash Riazi
Hude Quan
Na Li
LM&MA
26
0
0
31 Mar 2025
Communication-Efficient and Personalized Federated Foundation Model Fine-Tuning via Tri-Matrix Adaptation
Yong Li
Bo Liu
Sheng Huang
Zhe Zhang
Xiaotong Yuan
Richang Hong
46
0
0
31 Mar 2025
CONGRAD:Conflicting Gradient Filtering for Multilingual Preference Alignment
Jiangnan Li
Thuy-Trang Vu
Christian Herold
Amirhossein Tebbifakhr
Shahram Khadivi
Gholamreza Haffari
35
0
0
31 Mar 2025
LLMs for Explainable AI: A Comprehensive Survey
Ahsan Bilal
David Ebert
Beiyu Lin
72
1
0
31 Mar 2025
Order Independence With Finetuning
Katrina Brown
Reid McIlroy
35
0
0
30 Mar 2025
FeRG-LLM : Feature Engineering by Reason Generation Large Language Models
Jeonghyun Ko
Gyeongyun Park
Donghoon Lee
Kyunam Lee
LRM
57
0
0
30 Mar 2025
An Analysis of Decoding Methods for LLM-based Agents for Faithful Multi-Hop Question Answering
Alexander Murphy
Mohd Sanad Zaki Rizvi
Aden Haussmann
Ping Nie
Guifu Liu
Aryo Pradipta Gema
Pasquale Minervini
52
0
0
30 Mar 2025
EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing
Hongxiang Jiang
Jihao Yin
Qixiong Wang
Jiaqi Feng
Guo Chen
53
0
0
30 Mar 2025
When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?
Tuo Liang
Zhe Hu
Jing Li
Hao Zhang
Yiren Lu
...
Yiran Qiao
Disheng Liu
Jeirui Peng
Jing Ma
Yu Yin
54
0
0
29 Mar 2025
Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury
Hanan Gani
Nishit Anand
Sayan Nag
Ruohan Gao
Mohamed Elhoseiny
Salman Khan
Dinesh Manocha
LRM
56
0
0
29 Mar 2025
Generating Synthetic Oracle Datasets to Analyze Noise Impact: A Study on Building Function Classification Using Tweets
Shanshan Bai
Anna Kruspe
X. X. Zhu
53
0
0
28 Mar 2025
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
Wei Shen
Guanlin Liu
Zheng Wu
Ruofei Zhu
Qingping Yang
Chao Xin
Yu Yue
Lin Yan
92
8
0
28 Mar 2025
Learning to Reason for Long-Form Story Generation
Alexander Gurung
Mirella Lapata
ReLM
OffRL
LRM
63
1
0
28 Mar 2025
Learning to Instruct for Visual Instruction Tuning
Zhihan Zhou
Feng Hong
Jiaan Luo
Jiangchao Yao
Dongsheng Li
Bo Han
Yujie Zhang
Yanfeng Wang
VLM
68
0
0
28 Mar 2025
RLDBF: Enhancing LLMs Via Reinforcement Learning With DataBase FeedBack
Weichen Dai
Zijie Dai
Zhijie Huang
Yixuan Pan
Xinhe Li
Xi Li
Yi Zhou
Ji Qi
Wu Jiang
24
0
0
28 Mar 2025
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
Xueliang Wang
Linrui Ma
Jerry Huang
Peng Lu
Prasanna Parthasarathi
Xiao-Wen Chang
Boxing Chen
Yufei Cui
KELM
53
1
0
28 Mar 2025
The Mind in the Machine: A Survey of Incorporating Psychological Theories in LLMs
Zizhou Liu
Ziwei Gong
Lin Ai
Zheng Hui
Run Chen
Colin Wayne Leach
Michelle R. Greene
Julia Hirschberg
LLMAG
195
0
0
28 Mar 2025
SocialGen: Modeling Multi-Human Social Interaction with Language Models
Heng Yu
Juze Zhang
Changan Chen
Tiange Xiang
Yusu Fang
Juan Carlos Niebles
Ehsan Adeli
VGen
54
0
0
28 Mar 2025
Preference-based Learning with Retrieval Augmented Generation for Conversational Question Answering
Magdalena Kaiser
Gerhard Weikum
42
0
0
28 Mar 2025
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
Ruining Li
Chuanxia Zheng
Christian Rupprecht
Andrea Vedaldi
42
1
0
28 Mar 2025
Token-Driven GammaTune: Adaptive Calibration for Enhanced Speculative Decoding
Aayush Gautam
Susav Shrestha
Narasimha Annapareddy
56
0
0
28 Mar 2025
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning
Weiqi Li
X. Zhang
Shijie Zhao
Yujie Zhang
Junlin Li
Li Zhang
Jian Zhang
50
3
0
28 Mar 2025
Integrating Artificial Intelligence with Human Expertise: An In-depth Analysis of ChatGPT's Capabilities in Generating Metamorphic Relations
Yifan Zhang
Dave Towey
Matthew Pike
Q. Luu
Huai Liu
T. Chen
34
0
0
28 Mar 2025
ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models
Chung-En Sun
Ge Yan
Tsui-Wei Weng
KELM
LRM
65
1
0
27 Mar 2025
Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap
Tong Nie
Jian Sun
Wei Ma
72
1
0
27 Mar 2025
3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models
Yujie Zhang
Mengchen Zhang
Tong Wu
Tengfei Wang
Gordon Wetzstein
Dahua Lin
Ziwei Liu
ELM
79
0
0
27 Mar 2025
FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs
Xiaoqin Wang
Xusen Ma
Xianxu Hou
Meidan Ding
Yudong Li
Junliang Chen
Wenting Chen
Xiaoyang Peng
LinLin Shen
CVBM
73
0
0
27 Mar 2025
Boosting Large Language Models with Mask Fine-Tuning
M. Zhang
Yue Bai
Huan Wang
Yizhou Wang
Qihua Dong
Y. Fu
CLL
58
0
0
27 Mar 2025
SWI: Speaking with Intent in Large Language Models
Yuwei Yin
EunJeong Hwang
Giuseppe Carenini
LRM
51
0
0
27 Mar 2025
Controlling Large Language Model with Latent Actions
Chengxing Jia
Ziniu Li
Pengyuan Wang
Yi-Chen Li
Zhenyu Hou
Yuxiao Dong
Y. Yu
58
0
0
27 Mar 2025
Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs
Zitian Wang
Yue Liao
Kang Rong
Fengyun Rao
Yibo Yang
Si Liu
75
0
0
26 Mar 2025
Optimizing Safe and Aligned Language Generation: A Multi-Objective GRPO Approach
Xuying Li
Zhuo Li
Yuji Kosuga
Victor Bian
53
3
0
26 Mar 2025
ASGO: Adaptive Structured Gradient Optimization
Kang An
Yuxing Liu
Rui Pan
Shiqian Ma
D. Goldfarb
Tong Zhang
ODL
97
2
0
26 Mar 2025
TransDiffSBDD: Causality-Aware Multi-Modal Structure-Based Drug Design
Xiuyuan Hu
Guoqing Liu
Can Chen
Yang Zhao
Hao Zhang
Xue Liu
66
2
0
26 Mar 2025
A multi-agentic framework for real-time, autonomous freeform metasurface design
Robert Lupoiu
Yixuan Shao
Tianxiang Dai
Chenkai Mao
Kofi Edee
Jonathan A. Fan
AI4CE
73
0
0
26 Mar 2025
Generating Synthetic Data with Formal Privacy Guarantees: State of the Art and the Road Ahead
Viktor Schlegel
Anil A Bharath
Zilong Zhao
Kevin Yee
71
0
0
26 Mar 2025
Cyborg Data: Merging Human with AI Generated Training Data
Kai North
Christopher Ormerod
37
0
0
26 Mar 2025
Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy
Joonhyun Jeong
Seyun Bae
Yeonsung Jung
Jaeryong Hwang
Eunho Yang
AAML
45
1
0
26 Mar 2025
Multi-head Reward Aggregation Guided by Entropy
Xiaomin Li
Xupeng Chen
Jingxuan Fan
Eric Hanchen Jiang
Mingye Gao
AAML
60
2
0
26 Mar 2025
MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning
Yiwei Ma
Guohai Xu
Xiaoshuai Sun
Jiayi Ji
Jie Lou
Debing Zhang
Rongrong Ji
95
0
0
26 Mar 2025
GAPO: Learning Preferential Prompt through Generative Adversarial Policy Optimization
Zhouhong Gu
Xingzhou Chen
Xiaoran Shi
Tao Wang
Suhang Zheng
Tianyu Li
Hongwei Feng
Yanghua Xiao
78
0
0
26 Mar 2025
Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy
Yinan Sun
Xiongkuo Min
Zicheng Zhang
Yixuan Gao
Yuhang Cao
Guangtao Zhai
VLM
64
0
0
26 Mar 2025
Reasoning Beyond Limits: Advances and Open Problems for LLMs
M. Ferrag
Norbert Tihanyi
Merouane Debbah
ELM
OffRL
LRM
AI4CE
190
3
0
26 Mar 2025
OmniNova:A General Multimodal Agent Framework
Pengfei Du
LLMAG
47
0
0
25 Mar 2025
DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning
Fucai Ke
Vijay Kumar B G
Xingjian Leng
Zhixi Cai
Zaid Khan
Weiqing Wang
P. D. Haghighi
H. Rezatofighi
Manmohan Chandraker
46
0
0
25 Mar 2025
RL-finetuning LLMs from on- and off-policy data with a single algorithm
Yunhao Tang
Taco Cohen
David W. Zhang
Michal Valko
Rémi Munos
OffRL
44
2
0
25 Mar 2025
Linguistic Blind Spots of Large Language Models
Jiali Cheng
Hadi Amiri
60
1
0
25 Mar 2025
Direct Post-Training Preference Alignment for Multi-Agent Motion Generation Models Using Implicit Feedback from Pre-training Demonstrations
Ran Tian
Kratarth Goel
46
0
0
25 Mar 2025
FLEX: A Benchmark for Evaluating Robustness of Fairness in Large Language Models
Dahyun Jung
Seungyoon Lee
Hyeonseok Moon
Chanjun Park
Heuiseok Lim
AAML
ALM
ELM
58
0
0
25 Mar 2025
Generative Linguistics, Large Language Models, and the Social Nature of Scientific Success
Sophie Hao
ELM
AI4CE
56
0
0
25 Mar 2025