ResearchTrend.AI
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke E. Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,392 papers shown
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning
Yejie Wang, Keqing He, Guanting Dong, Pei Wang, Weihao Zeng, ..., Yutao Mou, Mengdi Zhang, Jingang Wang, Xunliang Cai, Weiran Xu
ALM · 14 Feb 2024

Exploring the Adversarial Capabilities of Large Language Models
Lukas Struppek, Minh Hieu Le, Dominik Hintersdorf, Kristian Kersting
ELM, AAML · 14 Feb 2024

AgentLens: Visual Analysis for Agent Behaviors in LLM-based Autonomous Systems
Jiaying Lu, Bo Pan, Jieyi Chen, Yingchaojie Feng, Jingyuan Hu, Yuchen Peng, Wei Chen
14 Feb 2024

SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
AAML · 14 Feb 2024

AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability
Siwei Yang, Bingchen Zhao, Cihang Xie
LRM · 14 Feb 2024
How Secure Are Large Language Models (LLMs) for Navigation in Urban Environments?
Congcong Wen, Jiazhao Liang, Shuaihang Yuan, Hao Huang, Geeta Chandra Raju Bethala, Yu-Shen Liu, Mengyu Wang, Anthony Tzes, Yi Fang
AAML · 14 Feb 2024

Reinforcement Learning from Human Feedback with Active Queries
Kaixuan Ji, Jiafan He, Quanquan Gu
14 Feb 2024

Rethinking Machine Unlearning for Large Language Models
Sijia Liu, Yuanshun Yao, Jinghan Jia, Stephen Casper, Nathalie Baracaldo, ..., Hang Li, Kush R. Varshney, Mohit Bansal, Sanmi Koyejo, Yang Liu
AILaw, MU · 13 Feb 2024

InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment
Jianing Wang, Junda Wu, Yupeng Hou, Yao Liu, Ming Gao, Julian McAuley
13 Feb 2024
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
Alex Havrilla, Sharath Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Roberta Raileanu
ReLM, LRM · 13 Feb 2024
Measuring and Controlling Instruction (In)Stability in Language Model Dialogs
Kenneth Li, Tianle Liu, Naomi Bashkansky, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
13 Feb 2024

COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Xing-ming Guo, Fangxu Yu, Huan Zhang, Lianhui Qin, Bin Hu
AAML · 13 Feb 2024

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models
Fei Deng, Qifei Wang, Wei Wei, Matthias Grundmann, Tingbo Hou
EGVM · 13 Feb 2024

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin
LLMAG, LM&Ro · 13 Feb 2024

Large Language Models for the Automated Analysis of Optimization Algorithms
Camilo Chacón Sartori, Christian Blum, Gabriela Ochoa
13 Feb 2024
Visually Dehallucinative Instruction Generation
Sungguk Cha, Jusung Lee, Younghyun Lee, Cheoljong Yang
MLLM · 13 Feb 2024

Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
Tobias Schimanski, Jingwei Ni, Mathias Kraus, Elliott Ash, Markus Leippold
13 Feb 2024

A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
Shentao Yang, Tianqi Chen, Mingyuan Zhou
EGVM · 13 Feb 2024

Active Preference Learning for Large Language Models
William Muldrew, Peter Hayes, Mingtian Zhang, David Barber
12 Feb 2024

Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts
Yueqin Yin, Zhendong Wang, Yi Gu, Hai Huang, Weizhu Chen, Mingyuan Zhou
12 Feb 2024
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Mateusz Lajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, ..., Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman
LM&MA · 12 Feb 2024

Large Language Models as Agents in Two-Player Games
Yang Liu, Peng Sun, Hang Li
LLMAG · 12 Feb 2024

Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
Siddharth Karamcheti, Suraj Nair, Ashwin Balakrishna, Percy Liang, Thomas Kollar, Dorsa Sadigh
MLLM, VLM · 12 Feb 2024

Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Ahmet Üstün, Viraat Aryabumi, Zheng-Xin Yong, Wei-Yin Ko, Daniel D'souza, ..., Shayne Longpre, Niklas Muennighoff, Marzieh Fadaee, Julia Kreutzer, Sara Hooker
ALM, ELM, SyDa, LRM · 12 Feb 2024
Towards Unified Alignment Between Agents, Humans, and Environment
Zonghan Yang, An Liu, Zijun Liu, Wenbing Huang, Fangzhou Xiong, ..., Zhenhe Zhang, Ziyue Wang, Zhicheng Guo, Peng Li, Yang Liu
12 Feb 2024

Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping
Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, ..., Liu Liu, Yatao Bian, Tingyang Xu, Xueqian Wang, Peilin Zhao
12 Feb 2024

MAFIA: Multi-Adapter Fused Inclusive LanguAge Models
Prachi Jain, Ashutosh Sathe, Varun Gumma, Kabir Ahuja, Sunayana Sitaram
12 Feb 2024

VisLingInstruct: Elevating Zero-Shot Learning in Multi-Modal Language Models with Autonomous Instruction Optimization
Dongsheng Zhu, Xunzhu Tang, Weidong Han, Jinghui Lu, Yukun Zhao, Guoliang Xing, Junfeng Wang, D. Yin
VLM, MLLM · 12 Feb 2024

Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples
Qingkai Zeng, Yuyang Bai, Zhaoxuan Tan, Shangbin Feng, Zhenwen Liang, Zhihan Zhang, Meng Jiang
AI4CE · 12 Feb 2024
ODIN: Disentangled Reward Mitigates Hacking in RLHF
Lichang Chen, Chen Zhu, Davit Soselia, Jiuhai Chen, Dinesh Manocha, Tom Goldstein, Heng-Chiao Huang, Mohammad Shoeybi, Bryan Catanzaro
AAML · 11 Feb 2024

Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
Chen Ye, Wei Xiong, Yuheng Zhang, Nan Jiang, Tong Zhang
OffRL · 11 Feb 2024

How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?
Ryan Liu, T. Sumers, Ishita Dasgupta, Thomas Griffiths
LLMAG · 11 Feb 2024

CPSDBench: A Large Language Model Evaluation Benchmark and Baseline for Chinese Public Security Domain
Xin Tong, Bo Jin, Zhi Lin, Binjun Wang, Ting Yu, Qiang Cheng
ELM · 11 Feb 2024

TransGPT: Multi-modal Generative Pre-trained Transformer for Transportation
Peng Wang, Xiang Wei, Fangxu Hu, Wenjuan Han
11 Feb 2024

Natural Language Reinforcement Learning
Xidong Feng, Bo Liu, Mengyue Yang, Ziyan Wang, Girish A. Koushiks, Yali Du, Ying Wen, Jun Wang
OffRL · 11 Feb 2024
OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning
Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin Du, Yanfeng Wang, Siheng Chen
ALM, FedML, AIFin · 10 Feb 2024

GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators
Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng
10 Feb 2024

Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Han Shen, Zhuoran Yang, Tianyi Chen
OffRL · 10 Feb 2024

Retrosynthesis Prediction via Search in (Hyper) Graph
Zixun Lan, Binjie Hong, Jiajun Zhu, Zuo Zeng, Zhenfu Liu, Limin Yu, Fei Ma
09 Feb 2024

GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding
Stefan Dernbach, Khushbu Agarwal, Alejandro Zuniga, Michael Henry, Sutanay Choudhury
09 Feb 2024
Corruption Robust Offline Reinforcement Learning with Human Feedback
Debmalya Mandal, Andi Nika, Parameswaran Kamalaruban, Adish Singla, Goran Radanović
OffRL · 09 Feb 2024

NICE: To Optimize In-Context Examples or Not?
Pragya Srivastava, Satvik Golechha, Amit Deshpande, Amit Sharma
09 Feb 2024

Scalable Interactive Machine Learning for Future Command and Control
Anna Madison, Ellen R. Novoseller, Vinicius G. Goecks, Benjamin T. Files, Nicholas R. Waytowich, Alfred Yu, Vernon J. Lawhern, Steven Thurman, Christopher Kelshaw, Kaleb McDowell
09 Feb 2024

V-STaR: Training Verifiers for Self-Taught Reasoners
Arian Hosseini, Xingdi Yuan, Nikolay Malkin, Rameswar Panda, Alessandro Sordoni, Rishabh Agarwal
ReLM, LRM · 09 Feb 2024

StruQ: Defending Against Prompt Injection with Structured Queries
Sizhe Chen, Julien Piet, Chawin Sitawarin, David Wagner
SILM, AAML · 09 Feb 2024
ExaRanker-Open: Synthetic Explanation for IR using Open-Source LLMs
Fernando Ferraretto, Thiago Laitz, R. Lotufo, Rodrigo Nogueira
LRM · 09 Feb 2024

LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education
Unggi Lee, Minji Jeon, Yunseo Lee, Gyuri Byun, Yoorim Son, Jaeyoon Shin, Hongkyu Ko, Hyeoncheol Kim
09 Feb 2024

Fight Back Against Jailbreaking via Prompt Adversarial Tuning
Yichuan Mo, Yuji Wang, Zeming Wei, Yisen Wang
AAML, SILM · 09 Feb 2024

Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement
Muning Wen, Junwei Liao, Cheng Deng, Jun Wang, Weinan Zhang, Ying Wen
09 Feb 2024

The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate
Juhyun Oh, Eunsu Kim, Inha Cha, Alice Oh
ELM · 09 Feb 2024