ResearchTrend.AI
Training language models to follow instructions with human feedback


4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLM ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,395 papers shown
Title
How Does Code Pretraining Affect Language Model Task Performance?
Jackson Petty
Sjoerd van Steenkiste
Tal Linzen
133
13
0
06 Sep 2024
On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization
Yong Lin
Skyler Seto
Maartje ter Hoeve
Katherine Metcalf
B. Theobald
Xuan Wang
Yizhe Zhang
Chen Huang
Tong Zhang
107
15
0
05 Sep 2024
CogniDual Framework: Self-Training Large Language Models within a Dual-System Theoretical Framework for Improving Cognitive Tasks
Yongxin Deng
Xihe Qiu
Jue Chen
Chao Qu
Jing Pan
Yuan Cheng
Yinghui Xu
Wei Chu
93
3
0
05 Sep 2024
Visual Prompting in Multimodal Large Language Models: A Survey
Junda Wu
Zhehao Zhang
Yu Xia
Xintong Li
Zhaoyang Xia
...
Subrata Mitra
Dimitris N. Metaxas
Lina Yao
Jingbo Shang
Julian McAuley
VLM LRM
121
23
0
05 Sep 2024
Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui
Yishi Xu
Zhewei Huang
Shuchang Zhou
Jianbin Jiao
Junge Zhang
PILM AAML
138
2
0
05 Sep 2024
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Bofei Gao
Feifan Song
Yibo Miao
Zefan Cai
Zhiyong Yang
...
Houfeng Wang
Zhifang Sui
Peiyi Wang
Baobao Chang
163
14
0
04 Sep 2024
Irrelevant Alternatives Bias Large Language Model Hiring Decisions
Kremena Valkanova
Pencho Yordanov
73
1
0
04 Sep 2024
Do We Trust What They Say or What They Do? A Multimodal User Embedding Provides Personalized Explanations
Zhicheng Ren
Zhiping Xiao
Yizhou Sun
99
0
0
04 Sep 2024
CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation
Ingo Ziegler
Abdullatif Köksal
Desmond Elliott
Hinrich Schütze
82
6
0
03 Sep 2024
Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models
Bin Fu
Qiyang Wan
Jialin Li
Ruiping Wang
Xilin Chen
60
0
0
03 Sep 2024
Self-Instructed Derived Prompt Generation Meets In-Context Learning: Unlocking New Potential of Black-Box LLMs
Zhuo Li
Yuhao Du
Jinpeng Hu
Xiang Wan
Anningzhe Gao
73
2
0
03 Sep 2024
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Wei Chen
Zhen Huang
Liang Xie
Binbin Lin
Houqiang Li
...
Deng Cai
Yonggang Zhang
Wenxiao Wang
Xu Shen
Jieping Ye
152
10
0
03 Sep 2024
Imitating Language via Scalable Inverse Reinforcement Learning
Markus Wulfmeier
Michael Bloesch
Nino Vieillard
Arun Ahuja
Jorg Bornschein
...
Jost Tobias Springenberg
Nikola Momchev
Olivier Bachem
Matthieu Geist
Martin Riedmiller
114
10
0
02 Sep 2024
Conversational Complexity for Assessing Risk in Large Language Models
John Burden
Manuel Cebrian
José Hernández-Orallo
105
2
0
02 Sep 2024
Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information
Yi Chen
Jian Xu
Xu-Yao Zhang
Wen-Zhuo Liu
Yang-Yang Liu
Cheng-Lin Liu
129
4
0
02 Sep 2024
SCOPE: Sign Language Contextual Processing with Embedding from LLMs
Yuqi Liu
Wenqian Zhang
Sihan Ren
Chengyu Huang
Jingyi Yu
Lan Xu
SLR
125
0
0
02 Sep 2024
NYK-MS: A Well-annotated Multi-modal Metaphor and Sarcasm Understanding Benchmark on Cartoon-Caption Dataset
Ke Chang
Hao Li
Junzhao Zhang
Yunfang Wu
78
0
0
02 Sep 2024
Self-Judge: Selective Instruction Following with Alignment Self-Evaluation
Hai Ye
Hwee Tou Ng
ELM ALM
56
5
0
02 Sep 2024
User-Driven Value Alignment: Understanding Users' Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions
Xianzhe Fan
Qing Xiao
Xuhui Zhou
Jiaxin Pei
Maarten Sap
Zhicong Lu
Hong Shen
139
8
0
01 Sep 2024
The Dark Side of Human Feedback: Poisoning Large Language Models via User Inputs
Bocheng Chen
Hanqing Guo
Guangjing Wang
Yuanda Wang
Qiben Yan
AAML
106
5
0
01 Sep 2024
Diffusion Policy Policy Optimization
Allen Z. Ren
Justin Lidard
Lars L. Ankile
Anthony Simeonov
Pulkit Agrawal
Anirudha Majumdar
Benjamin Burchfiel
Hongkai Dai
Max Simchowitz
168
57
0
01 Sep 2024
Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models
Bang An
Sicheng Zhu
Ruiyi Zhang
Michael-Andrei Panaitescu-Liess
Yuancheng Xu
Furong Huang
AAML
149
18
0
01 Sep 2024
Enhancing Document-level Argument Extraction with Definition-augmented Heuristic-driven Prompting for LLMs
Tongyue Sun
Jiayi Xiao
93
0
0
30 Aug 2024
Enhancing Event Reasoning in Large Language Models through Instruction Fine-Tuning with Semantic Causal Graphs
Mazal Bethany
Emet Bethany
Brandon Wherry
Cho-Yu Chiang
Nishant Vishwamitra
Anthony Rios
Peyman Najafirad
LRM
101
1
0
30 Aug 2024
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback
Jiayi Zhou
Yalan Qin
Juntao Dai
Yaodong Yang
154
8
0
30 Aug 2024
Tool-Assisted Agent on SQL Inspection and Refinement in Real-World Scenarios
Zhongyuan Wang
Richong Zhang
Zhijie Nie
Jaein Kim
79
2
0
30 Aug 2024
Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models
Alec Solway
ALM
95
0
0
29 Aug 2024
A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models
Yi-Lin Tuan
William Yang Wang
98
1
0
29 Aug 2024
Emerging Vulnerabilities in Frontier Models: Multi-Turn Jailbreak Attacks
Tom Gibbs
Ethan Kosak-Hine
George Ingebretsen
Jason Zhang
Julius Broomfield
Sara Pieri
Reihaneh Iranmanesh
Reihaneh Rabbany
Kellin Pelrine
AAML
80
7
0
29 Aug 2024
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation
Shiwei Wu
Joya Chen
Kevin Qinghong Lin
Qimeng Wang
Yan Gao
Qianli Xu
Tong Xu
Yao Hu
Enhong Chen
Mike Zheng Shou
VLM
86
14
0
29 Aug 2024
Iterative Graph Alignment
Fangyuan Yu
H. S. Arora
Matt Johnson
83
2
0
29 Aug 2024
RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model
Zhuan Shi
Jing Yan
Xiaoli Tang
Lingjuan Lyu
Boi Faltings
78
1
0
29 Aug 2024
Self-Alignment: Improving Alignment of Cultural Values in LLMs via In-Context Learning
Rochelle Choenni
Ekaterina Shutova
120
12
0
29 Aug 2024
ChatSUMO: Large Language Model for Automating Traffic Scenario Generation in Simulation of Urban MObility
Shuyang Li
Talha Azfar
Ruimin Ke
LLMAG
122
16
0
29 Aug 2024
Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic
Xin Zheng
Jie Lou
Boxi Cao
Xueru Wen
Yuqiu Ji
Hongyu Lin
Yaojie Lu
Xianpei Han
Debing Zhang
Le Sun
OffRL LRM LLMAG ReLM KELM
134
14
1
29 Aug 2024
EPO: Hierarchical LLM Agents with Environment Preference Optimization
Qi Zhao
Haotian Fu
Chen Sun
George Konidaris
106
11
0
28 Aug 2024
Knowledge Navigator: LLM-guided Browsing Framework for Exploratory Search in Scientific Literature
Uri Katz
Mosh Levy
Yoav Goldberg
61
5
0
28 Aug 2024
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
Jiayi Gui
Yiming Liu
Jiale Cheng
Xiaotao Gu
Xiao-Yang Liu
Hongning Wang
Yuxiao Dong
Jie Tang
Minlie Huang
ELM LLMAG LRM
97
7
0
28 Aug 2024
An Extremely Data-efficient and Generative LLM-based Reinforcement Learning Agent for Recommenders
Shuang Feng
Grace Feng
OffRL
66
2
0
28 Aug 2024
CBF-LLM: Safe Control for LLM Alignment
Yuya Miyaoka
Masaki Inoue
64
2
0
28 Aug 2024
SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models
Dian Yu
Baolin Peng
Ye Tian
Linfeng Song
Haitao Mi
Dong Yu
ALM LRM
78
3
0
28 Aug 2024
ConsistencyTrack: A Robust Multi-Object Tracker with a Generation Strategy of Consistency Model
Lifan Jiang
Zhihui Wang
Siqi Yin
Guangxiao Ma
Peng Zhang
Boxi Wu
DiffM
154
0
0
28 Aug 2024
An Investigation of Warning Erroneous Chat Translations in Cross-lingual Communication
Yunmeng Li
Jun Suzuki
Makoto Morishita
Kaori Abe
Kentaro Inui
123
1
0
28 Aug 2024
Intertwined Biases Across Social Media Spheres: Unpacking Correlations in Media Bias Dimensions
Yifan Liu
Yike Li
Dong Wang
74
0
0
27 Aug 2024
LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet
Nathaniel Li
Ziwen Han
Ian Steneker
Willow Primack
Riley Goodside
Hugh Zhang
Zifan Wang
Cristina Menghini
Summer Yue
AAML MU
113
57
0
27 Aug 2024
Evaluating Stability of Unreflective Alignment
James Lucassen
Mark Henry
Philippa Wright
Owen Yeung
71
0
0
27 Aug 2024
Zero-Shot Visual Reasoning by Vision-Language Models: Benchmarking and Analysis
Aishik Nagar
Shantanu Jaiswal
Cheston Tan
ReLM LRM
65
12
0
27 Aug 2024
Negation Blindness in Large Language Models: Unveiling the NO Syndrome in Image Generation
Mohammad Nadeem
S. Sohail
Min Zhang
Björn W. Schuller
Amir Hussain
84
4
0
27 Aug 2024
Cross-Modal Learning for Chemistry Property Prediction: Large Language Models Meet Graph Machine Learning
Sakhinana Sagar Srinivas
Venkataramana Runkana
AI4CE
82
2
0
27 Aug 2024
Advancing Adversarial Suffix Transfer Learning on Aligned Large Language Models
Hongfu Liu
Yuxi Xie
Ye Wang
Michael Shieh
140
3
0
27 Aug 2024