Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,395 papers shown
Title
Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction
Yonggang Jin
Ge Zhang
Hao Zhao
Tianyu Zheng
Jiawei Guo
Liuyu Xiang
Shawn Yue
Stephen W. Huang
Zhaofeng He
Jie Fu
OffRL
126
4
0
06 Feb 2024
Measuring Implicit Bias in Explicitly Unbiased Large Language Models
Xuechunzi Bai
Angelina Wang
Ilia Sucholutsky
Thomas Griffiths
150
36
0
06 Feb 2024
Systematic Biases in LLM Simulations of Debates
Amir Taubenfeld
Yaniv Dover
Roi Reichart
Ariel Goldstein
74
59
0
06 Feb 2024
Discovery of the Hidden World with Large Language Models
Chenxi Liu
Yongqiang Chen
Tongliang Liu
Biwei Huang
James Cheng
Bo Han
Kun Zhang
CML
121
13
0
06 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM, ELM, PILM
127
181
0
06 Feb 2024
Large Language Models to Enhance Bayesian Optimization
Tennison Liu
Nicolás Astorga
Nabeel Seedat
M. Schaar
160
60
0
06 Feb 2024
Embedding Large Language Models into Extended Reality: Opportunities and Challenges for Inclusion, Engagement, and Privacy
Efe Bozkir
Suleyman Ozdel
Ka Hei Carrie Lau
Mengdi Wang
Hong Gao
Enkelejda Kasneci
109
27
0
06 Feb 2024
DistiLLM: Towards Streamlined Distillation for Large Language Models
Jongwoo Ko
Sungnyun Kim
Tianyi Chen
SeYoung Yun
130
40
0
06 Feb 2024
ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse LLMs
Zhengyan Zhang
Yixin Song
Guanghui Yu
Xu Han
Yankai Lin
Chaojun Xiao
Chenyang Song
Zhiyuan Liu
Zeyu Mi
Maosong Sun
82
36
0
06 Feb 2024
Large Language Models As MOOCs Graders
Shahriar Golchin
Nikhil Garuda
Christopher Impey
Matthew Wenger
AI4Ed
31
5
0
06 Feb 2024
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
Xiangxiang Chu
Limeng Qiao
Xinyu Zhang
Shuang Xu
Fei Wei
...
Xiaofei Sun
Yiming Hu
Xinyang Lin
Bo Zhang
Chunhua Shen
VLM, MLLM
85
109
0
06 Feb 2024
INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection
Chao Chen
Kai-Chun Liu
Ze Chen
Yi Gu
Yue-bo Wu
Mingyuan Tao
Zhihang Fu
Jieping Ye
HILM
138
111
0
06 Feb 2024
Personalized Language Modeling from Personalized Human Feedback
Xinyu Li
Zachary C. Lipton
Liu Leqi
ALM
139
59
0
06 Feb 2024
On the Emergence of Cross-Task Linearity in the Pretraining-Finetuning Paradigm
Zhanpeng Zhou
Zijun Chen
Yilan Chen
Bo Zhang
Junchi Yan
MoMe
115
11
0
06 Feb 2024
Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models
Kelvin J.L. Koa
Yunshan Ma
Ritchie Ng
Tat-Seng Chua
AIFin, LLMAG
108
31
0
06 Feb 2024
Professional Agents -- Evolving Large Language Models into Autonomous Experts with Human-Level Competencies
Zhixuan Chu
Yan Wang
Feng Zhu
Lu Yu
Longfei Li
Jinjie Gu
LLMAG
42
9
0
06 Feb 2024
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Pei Zhou
Jay Pujara
Xiang Ren
Xinyun Chen
Heng-Tze Cheng
Quoc V. Le
Ed H. Chi
Denny Zhou
Swaroop Mishra
Huaixiu Steven Zheng
LRM, ReLM
82
56
0
06 Feb 2024
RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents
Tomoyuki Kagaya
Thong Jing Yuan
Yuxuan Lou
J. Karlekar
Sugiri Pranata
Akira Kinose
Koki Oguri
Felix Wick
Yang You
LLMAG
100
36
0
06 Feb 2024
PreGIP: Watermarking the Pretraining of Graph Neural Networks for Deep Intellectual Property Protection
Enyan Dai
Min Lin
Suhang Wang
84
3
0
06 Feb 2024
Toward Human-AI Alignment in Large-Scale Multi-Player Games
Sugandha Sharma
Guy Davidson
Khimya Khetarpal
Anssi Kanervisto
Udit Arora
Katja Hofmann
Ida Momennejad
85
0
0
05 Feb 2024
Neural networks for abstraction and reasoning: Towards broad generalization in machines
Mikel Bober-Irizar
Soumya Banerjee
AI4CE, LRM, NAI
92
10
0
05 Feb 2024
Nevermind: Instruction Override and Moderation in Large Language Models
Edward Kim
ALM
44
1
0
05 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM, LRM
225
1,289
0
05 Feb 2024
Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models
Anthony Sicilia
Hyunwoo J. Kim
Khyathi Chandu
Malihe Alikhani
Jack Hessel
60
2
0
05 Feb 2024
MobilityGPT: Enhanced Human Mobility Modeling with a GPT model
Ammar Haydari
Dongjie Chen
Zhengfeng Lai
Michael Zhang
Chen-Nee Chuah
147
10
0
05 Feb 2024
CIDAR: Culturally Relevant Instruction Dataset For Arabic
Zaid Alyafeai
Khalid Almubarak
Ahmed Ashraf
Deema Alnuhait
Saied Alshahrani
...
Qais Gawah
Zead Saleh
Mustafa Ghaleb
Yousef Ali
Maged S. Al-Shaibani
77
12
0
05 Feb 2024
Preference-Conditioned Language-Guided Abstraction
Andi Peng
Andreea Bobu
Belinda Z. Li
T. Sumers
Ilia Sucholutsky
Nishanth Kumar
Thomas Griffiths
Julie A. Shah
83
13
0
05 Feb 2024
EasyInstruct: An Easy-to-use Instruction Processing Framework for Large Language Models
Yixin Ou
Ningyu Zhang
Honghao Gui
Ziwen Xu
Shuofei Qiao
...
Kangwei Liu
Lei Li
Zhen Bi
Guozhou Zheng
Huajun Chen
SyDa
104
0
0
05 Feb 2024
InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions
Yiyuan Zhang
Yuhao Kang
Zhixin Zhang
Xiaohan Ding
Sanyuan Zhao
Xiangyu Yue
VGen
100
4
0
05 Feb 2024
Decoding-time Realignment of Language Models
Tianlin Liu
Shangmin Guo
Leonardo Bianco
Daniele Calandriello
Quentin Berthet
Felipe Llinares-López
Jessica Hoffmann
Lucas Dixon
Michal Valko
Mathieu Blondel
AI4CE
124
46
0
05 Feb 2024
Delving into Multi-modal Multi-task Foundation Models for Road Scene Understanding: From Learning Paradigm Perspectives
Sheng Luo
Wei Chen
Wanxin Tian
Rui Liu
Luanxuan Hou
...
Ling Shao
Yi Yang
Bojun Gao
Qun Li
Guobin Wu
141
17
0
05 Feb 2024
Kernel PCA for Out-of-Distribution Detection
Kun Fang
Qinghua Tao
Kexin Lv
Mingzhen He
Xiaolin Huang
Jie Yang
OODD
128
4
0
05 Feb 2024
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
160
35
0
05 Feb 2024
How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning
Zeping Yu
Sophia Ananiadou
102
12
0
05 Feb 2024
DeAL: Decoding-time Alignment for Large Language Models
James Y. Huang
Sailik Sengupta
Daniele Bonadiman
Yi-An Lai
Arshit Gupta
Nikolaos Pappas
Saab Mansour
Katrin Kirchoff
Dan Roth
131
36
0
05 Feb 2024
Adversarial Text Purification: A Large Language Model Approach for Defense
Raha Moraffah
Shubh Khandelwal
Amrita Bhattacharjee
Huan Liu
DeLMO, AAML
97
5
0
05 Feb 2024
Vision-Language Models Provide Promptable Representations for Reinforcement Learning
William Chen
Oier Mees
Aviral Kumar
Sergey Levine
VLM, LM&Ro
134
29
0
05 Feb 2024
Recursive Chain-of-Feedback Prevents Performance Degradation from Redundant Prompting
Jinwoo Ahn
Kyuseung Shin
ReLM, LRM, AI4CE
80
1
0
05 Feb 2024
GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models
Haibo Jin
Ruoxi Chen
Peiyan Zhang
Andy Zhou
Yang Zhang
Haohan Wang
LLMAG
113
28
0
05 Feb 2024
Detecting Out-of-Distribution Objects through Class-Conditioned Inpainting
Quang-Huy Nguyen
Jin Peng Zhou
Zhenzhen Liu
Khanh-Huyen Bui
Kilian Q. Weinberger
Wei-Lun Chao
Dung D. Le
84
0
0
05 Feb 2024
Governance of Generative Artificial Intelligence for Companies
Johannes Schneider
Rene Abraham
Christian Meske
SILM
108
6
0
05 Feb 2024
GIRT-Model: Automated Generation of Issue Report Templates
Nafiseh Nikeghbal
Amir Hossein Kargaran
Abbas Heydarnoori
98
3
0
04 Feb 2024
Are Large Language Models Table-based Fact-Checkers?
Hangwen Zhang
Q. Si
Peng Fu
Zheng Lin
Weiping Wang
LRM, LMTD
83
4
0
04 Feb 2024
Integration of cognitive tasks into artificial general intelligence test for large models
Youzhi Qu
Chen Wei
Penghui Du
Wenxin Che
Chi Zhang
...
Bin Hu
Kai Du
Haiyan Wu
Jia Liu
Quanying Liu
ELM
66
10
0
04 Feb 2024
BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback
Gaurav Pandey
Yatin Nandwani
Tahira Naseem
Mayank Mishra
Guangxuan Xu
Dinesh Raghu
Sachindra Joshi
Asim Munawar
Ramón Fernández Astudillo
BDL
68
4
0
04 Feb 2024
Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback
Yifu Yuan
Jianye Hao
Yi-An Ma
Zibin Dong
Hebin Liang
Jinyi Liu
Zhixin Feng
Kai-Wen Zhao
Yan Zheng
OffRL, ALM
79
16
0
04 Feb 2024
Aligner: Efficient Alignment by Learning to Correct
Jiaming Ji
Boyuan Chen
Hantao Lou
Chongye Guo
Borong Zhang
Xuehai Pan
Juntao Dai
Tianyi Qiu
Yaodong Yang
148
40
0
04 Feb 2024
KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion
Yanbin Wei
Qiushi Huang
James T. Kwok
Yu Zhang
84
44
0
04 Feb 2024
A Survey on Robotics with Foundation Models: toward Embodied AI
Zhiyuan Xu
Kun Wu
Junjie Wen
Jinming Li
Ning Liu
Zhengping Che
Jian Tang
AI4CE, LRM, LM&Ro
94
29
0
04 Feb 2024
Copyright Protection in Generative AI: A Technical Perspective
Jie Ren
Han Xu
Pengfei He
Yingqian Cui
Shenglai Zeng
...
Hongzhi Wen
Jiayuan Ding
Hui Liu
Yi Chang
Jiliang Tang
DeLMO
106
43
0
04 Feb 2024