Training language models to follow instructions with human feedback


4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLM
    ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 7,310 papers shown
Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder
Jiaqi Wang
Zhenxi Song
Zhengyu Ma
Xipeng Qiu
Min Zhang
Zhiguo Zhang
44
5
0
27 Feb 2024
Consistency Matters: Explore LLMs Consistency From a Black-Box Perspective
Fufangchen Zhao
Guoqiang Jin
Jiaheng Huang
Rui Zhao
Fei Tan
38
1
0
27 Feb 2024
SoFA: Shielded On-the-fly Alignment via Priority Rule Following
Xinyu Lu
Bowen Yu
Yaojie Lu
Hongyu Lin
Haiyang Yu
Le Sun
Xianpei Han
Yongbin Li
70
13
0
27 Feb 2024
RECOST: External Knowledge Guided Data-efficient Instruction Tuning
Qi Zhang
Yiming Zhang
Haobo Wang
Junbo Zhao
60
11
0
27 Feb 2024
Speak Out of Turn: Safety Vulnerability of Large Language Models in Multi-turn Dialogue
Zhenhong Zhou
Jiuyang Xiang
Haopeng Chen
Quan Liu
Zherui Li
Sen Su
37
20
0
27 Feb 2024
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Jie Cheng
Gang Xiong
Xingyuan Dai
Qinghai Miao
Yisheng Lv
Fei-Yue Wang
35
15
0
27 Feb 2024
Beyond the Known: Investigating LLMs Performance on Out-of-Domain Intent Detection
Pei Wang
Keqing He
Yejie Wang
Xiaoshuai Song
Yutao Mou
Jingang Wang
Yunsen Xian
Xunliang Cai
Weiran Xu
35
6
0
27 Feb 2024
Stochastic Gradient Succeeds for Bandits
Jincheng Mei
Zixin Zhong
Bo Dai
Alekh Agarwal
Csaba Szepesvári
Dale Schuurmans
40
1
0
27 Feb 2024
Measuring Vision-Language STEM Skills of Neural Models
Jianhao Shen
Ye Yuan
Srbuhi Mirzoyan
Ming Zhang
Chenguang Wang
VLM
35
8
0
27 Feb 2024
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
Biao Zhang
Zhongtao Liu
Colin Cherry
Orhan Firat
LRM
63
128
0
27 Feb 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
...
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM
VGen
EGVM
80
263
0
27 Feb 2024
Video as the New Language for Real-World Decision Making
Sherry Yang
Jacob Walker
Jack Parker-Holder
Yilun Du
Jake Bruce
Andre Barreto
Pieter Abbeel
Dale Schuurmans
VGen
36
46
0
27 Feb 2024
OSCaR: Object State Captioning and State Change Representation
Nguyen Nguyen
Jing Bi
A. Vosoughi
Yapeng Tian
Pooyan Fazli
Chenliang Xu
48
8
0
27 Feb 2024
Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models
Xinran Zhao
Hongming Zhang
Xiaoman Pan
Wenlin Yao
Dong Yu
Tongshuang Wu
Jianshu Chen
HILM
LRM
35
4
0
27 Feb 2024
Benchmarking LLMs on the Semantic Overlap Summarization Task
John Salvador
Naman Bansal
Mousumi Akter
Souvik Sarkar
Anupam Das
S. Karmaker
47
2
0
26 Feb 2024
A Survey of Large Language Models in Cybersecurity
Gabriel de Jesus Coelho da Silva
Carlos Becker Westphall
40
6
0
26 Feb 2024
Set the Clock: Temporal Alignment of Pretrained Language Models
Bowen Zhao
Zander Brumbaugh
Yizhong Wang
Hanna Hajishirzi
Noah A. Smith
CLL
KELM
41
11
0
26 Feb 2024
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
Paul Röttger
Valentin Hofmann
Valentina Pyatkin
Musashi Hinck
Hannah Rose Kirk
Hinrich Schütze
Dirk Hovy
ELM
26
53
0
26 Feb 2024
A Comprehensive Evaluation of Quantization Strategies for Large Language Models
Renren Jin
Jiangcun Du
Wuwei Huang
Wei Liu
Jian Luan
Bin Wang
Deyi Xiong
MQ
32
31
0
26 Feb 2024
CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models
Huijie Lv
Xiao Wang
Yuan Zhang
Caishuang Huang
Shihan Dou
Junjie Ye
Tao Gui
Qi Zhang
Xuanjing Huang
AAML
44
29
0
26 Feb 2024
Navigating Complexity: Orchestrated Problem Solving with Multi-Agent LLMs
Sumedh Rasal
E. Hauer
32
0
0
26 Feb 2024
Look Before You Leap: Towards Decision-Aware and Generalizable Tool-Usage for Large Language Models
Anchun Gui
Jian Li
Yong Dai
Nan Du
Han Xiao
41
1
0
26 Feb 2024
LangGPT: Rethinking Structured Reusable Prompt Design Framework for LLMs from the Programming Language
Ming Wang
Yuanzhong Liu
Xiaoyu Liang
Songlian Li
Yijie Huang
...
Shi Feng
Chi Zhang
Yifei Zhang
Minghui Zheng
Jigang Li
46
13
0
26 Feb 2024
Long-Context Language Modeling with Parallel Context Encoding
Howard Yen
Tianyu Gao
Danqi Chen
40
43
0
26 Feb 2024
PAQA: Toward ProActive Open-Retrieval Question Answering
Pierre Erbacher
Jian-Yun Nie
P. Preux
Laure Soulier
RALM
29
2
0
26 Feb 2024
Rethinking Negative Instances for Generative Named Entity Recognition
Yuyang Ding
Juntao Li
Pinzheng Wang
Zecheng Tang
Bowen Yan
Min Zhang
52
10
0
26 Feb 2024
mEdIT: Multilingual Text Editing via Instruction Tuning
Vipul Raheja
Dimitris Alikaniotis
Vivek Kulkarni
Bashar Alhafni
Dhruv Kumar
VLM
38
6
0
26 Feb 2024
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions
Yuan Zhang
Xiao Wang
Zhiheng Xi
Han Xia
Tao Gui
Qi Zhang
Xuanjing Huang
53
3
0
26 Feb 2024
LLM Inference Unveiled: Survey and Roofline Model Insights
Zhihang Yuan
Yuzhang Shang
Yang Zhou
Zhen Dong
Zhe Zhou
...
Yong Jae Lee
Yan Yan
Beidi Chen
Guangyu Sun
Kurt Keutzer
61
82
0
26 Feb 2024
Feedback Efficient Online Fine-Tuning of Diffusion Models
Masatoshi Uehara
Yulai Zhao
Kevin Black
Ehsan Hajiramezanali
Gabriele Scalia
N. Diamant
Alex Tseng
Sergey Levine
Tommaso Biancalani
39
22
0
26 Feb 2024
CodeS: Towards Building Open-source Language Models for Text-to-SQL
Haoyang Li
Jing Zhang
Hanbing Liu
Ju Fan
Xiaokang Zhang
Jun Zhu
Renjie Wei
Hongyan Pan
Cuiping Li
Hong Chen
ELM
AI4TS
45
98
0
26 Feb 2024
Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering
Mingxu Tao
Dongyan Zhao
Yansong Feng
LLMAG
49
3
0
26 Feb 2024
From Large Language Models and Optimization to Decision Optimization CoPilot: A Research Manifesto
Segev Wasserkrug
Léonard Boussioux
D. Hertog
F. Mirzazadeh
Ilker Birbil
Jannis Kurtz
Donato Maragno
LLMAG
51
3
0
26 Feb 2024
GenAINet: Enabling Wireless Collective Intelligence via Knowledge Transfer and Reasoning
Han Zou
Qiyang Zhao
Lina Bariah
Yu Tian
M. Bennis
S. Lasaulce
101
12
0
26 Feb 2024
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs
Cem Uluoglakci
T. Taşkaya-Temizel
HILM
35
2
0
25 Feb 2024
Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing
Jiabao Ji
Bairu Hou
Alexander Robey
George J. Pappas
Hamed Hassani
Yang Zhang
Eric Wong
Shiyu Chang
AAML
50
43
0
25 Feb 2024
No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices
Qi Pang
Shengyuan Hu
Wenting Zheng
Virginia Smith
WaLM
56
12
0
25 Feb 2024
DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
Xirui Li
Ruochen Wang
Minhao Cheng
Tianyi Zhou
Cho-Jui Hsieh
AAML
52
37
0
25 Feb 2024
PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization
Xiangdi Meng
Damai Dai
Weiyao Luo
Zhe Yang
Shaoxiang Wu
Xiaochen Wang
Peiyi Wang
Qingxiu Dong
Liang Chen
Zhifang Sui
114
11
0
25 Feb 2024
AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation
Yasheng Sun
Wenqing Chu
Hang Zhou
Kaisiyuan Wang
Hideki Koike
42
5
0
25 Feb 2024
InstructEdit: Instruction-based Knowledge Editing for Large Language Models
Ningyu Zhang
Bo Tian
Siyuan Cheng
Xiaozhuan Liang
Yi Hu
Kouying Xue
Yanjie Gou
Xi Chen
Huajun Chen
KELM
62
4
0
25 Feb 2024
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
Yao Mu
Junting Chen
Qinglong Zhang
Shoufa Chen
Qiaojun Yu
...
Wenhai Wang
Jifeng Dai
Yu Qiao
Mingyu Ding
Ping Luo
46
22
0
25 Feb 2024
Evaluating Robustness of Generative Search Engine on Adversarial Factual Questions
Xuming Hu
Xiaochuan Li
Junzhe Chen
Hai-Tao Zheng
Yangning Li
...
Yasheng Wang
Qun Liu
Lijie Wen
Philip S. Yu
Zhijiang Guo
AAML
ELM
35
5
0
25 Feb 2024
Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression
Xinze Li
Zhenghao Liu
Chenyan Xiong
Shi Yu
Yukun Yan
Shuo Wang
Ge Yu
VLM
43
4
0
25 Feb 2024
Detecting Machine-Generated Texts by Multi-Population Aware Optimization for Maximum Mean Discrepancy
Shuhai Zhang
Yiliao Song
Jiahao Yang
Yuanqing Li
Bo Han
Mingkui Tan
DeLMO
42
5
0
25 Feb 2024
Don't Forget Your Reward Values: Language Model Alignment via Value-based Calibration
Xin Mao
Fengming Li
Huimin Xu
Wei Zhang
A. Luu
ALM
45
6
0
25 Feb 2024
GraphWiz: An Instruction-Following Language Model for Graph Problems
Nuo Chen
Yuhan Li
Jianheng Tang
Jia Li
47
28
0
25 Feb 2024
ASETF: A Novel Method for Jailbreak Attack on LLMs through Translate Suffix Embeddings
Hao Wang
Hao Li
Minlie Huang
Lei Sha
AAML
48
12
0
25 Feb 2024
Rethinking Software Engineering in the Foundation Model Era: A Curated Catalogue of Challenges in the Development of Trustworthy FMware
Ahmed E. Hassan
Dayi Lin
Gopi Krishnan Rajbahadur
Keheliya Gallaba
F. Côgo
...
Kishanthan Thangarajah
G. Oliva
Jiahuei Lin
Wali Mohammad Abdullah
Zhen Ming Jiang
37
7
0
25 Feb 2024
Citation-Enhanced Generation for LLM-based Chatbots
Weitao Li
Junkai Li
Weizhi Ma
Yang Liu
71
18
0
25 Feb 2024