arXiv: 2203.02155
Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLM ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,381 papers shown
JRDB-Social: A Multifaceted Robotic Dataset for Understanding of Context and Dynamics of Human Interactions Within Social Groups
Simindokht Jahangard
Zhixi Cai
Shiki Wen
Hamid Rezatofighi
56
6
0
06 Apr 2024
Binary Classifier Optimization for Large Language Model Alignment
Seungjae Jung
Gunsoo Han
D. W. Nam
Kyoung-Woon On
82
25
0
06 Apr 2024
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
Yann Dubois
Balázs Galambosi
Percy Liang
Tatsunori Hashimoto
ALM
179
403
0
06 Apr 2024
Bayesian Additive Regression Networks
D. V. Boxel
UQCV BDL
82
0
0
05 Apr 2024
Prompt Public Large Language Models to Synthesize Data for Private On-device Applications
Shanshan Wu
Zheng Xu
Yanxiang Zhang
Yuanbo Zhang
Daniel Ramage
SyDa
88
12
0
05 Apr 2024
Pixel-wise RL on Diffusion Models: Reinforcement Learning from Rich Feedback
Mo Kordzanganeh
Danial Keshvary
Nariman Arian
EGVM
52
0
0
05 Apr 2024
Scope Ambiguities in Large Language Models
Gaurav Kamath
Sebastian Schuster
Sowmya Vajjala
Siva Reddy
69
6
0
05 Apr 2024
Social Skill Training with Large Language Models
Diyi Yang
Caleb Ziems
William B. Held
Omar Shaikh
Michael S. Bernstein
John C. Mitchell
LLMAG
80
11
0
05 Apr 2024
Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation
Xinyu Ma
Xu Chu
Zhibang Yang
Yang Lin
Xin Gao
Junfeng Zhao
96
10
0
05 Apr 2024
ROPO: Robust Preference Optimization for Large Language Models
Xize Liang
Chao Chen
Shuang Qiu
Jie Wang
Yue-bo Wu
Zhihang Fu
Zhihao Shi
Feng Wu
Jieping Ye
86
3
0
05 Apr 2024
Dynamic Prompt Optimizing for Text-to-Image Generation
Wenyi Mo
Tianyu Zhang
Yalong Bai
Fuchun Sun
Ji-Rong Wen
Qing Yang
86
13
0
05 Apr 2024
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer
Hele-Andra Kuulmets
Taido Purason
Agnes Luhtaru
Mark Fishel
78
19
0
05 Apr 2024
Can only LLMs do Reasoning?: Potential of Small Language Models in Task Planning
Gawon Choi
Hyemin Ahn
LM&Ro LRM
59
1
0
05 Apr 2024
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
Jingyu Zhang
Marc Marone
Tianjian Li
Benjamin Van Durme
Daniel Khashabi
195
9
0
05 Apr 2024
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
Makesh Narsimhan Sreedhar
Traian Rebedea
Shaona Ghosh
Jiaqi Zeng
Christopher Parisien
ALM
103
6
0
04 Apr 2024
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Corby Rosset
Ching-An Cheng
Arindam Mitra
Michael Santacroce
Ahmed Hassan Awadallah
Tengyang Xie
209
132
0
04 Apr 2024
Evaluating LLMs at Detecting Errors in LLM Responses
Ryo Kamoi
Sarkar Snigdha Sarathi Das
Renze Lou
Jihyun Janice Ahn
Yilun Zhao
...
Salika Dave
Shaobo Qin
Arman Cohan
Wenpeng Yin
Rui Zhang
86
25
0
04 Apr 2024
Intent Detection and Entity Extraction from BioMedical Literature
Ankan Mullick
Mukur Gupta
Pawan Goyal
MedIm
80
2
0
04 Apr 2024
ReFT: Representation Finetuning for Language Models
Zhengxuan Wu
Aryaman Arora
Zheng Wang
Atticus Geiger
Daniel Jurafsky
Christopher D. Manning
Christopher Potts
OffRL
122
72
0
04 Apr 2024
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm
Miao Lu
Han Zhong
Tong Zhang
Jose H. Blanchet
OffRL OOD
107
10
0
04 Apr 2024
Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models
Yantao Liu
Zijun Yao
Xin Lv
Yuchen Fan
S. Cao
Jifan Yu
Lei Hou
Juanzi Li
105
3
0
04 Apr 2024
Personalized LLM Response Generation with Parameterized Memory Injection
Kai Zhang
Lizhi Qing
Yangyang Kang
119
11
0
04 Apr 2024
Scaffolding Language Learning via Multi-modal Tutoring Systems with Pedagogical Instructions
Zhengyuan Liu
Stella Xin Yin
Carolyn Lee
Nancy F. Chen
AI4Ed
65
18
0
04 Apr 2024
LongVLM: Efficient Long Video Understanding via Large Language Models
Yuetian Weng
Mingfei Han
Haoyu He
Xiaojun Chang
Bohan Zhuang
VLM
127
65
0
04 Apr 2024
Learning to Plan and Generate Text with Citations
Constanza Fierro
Reinald Kim Amplayo
Fantine Huot
Nicola De Cao
Joshua Maynez
Shashi Narayan
Mirella Lapata
82
19
0
04 Apr 2024
Using Large Language Models to Enrich the Documentation of Datasets for Machine Learning
Joan Giner-Miguelez
Abel Gómez
Jordi Cabot
LLMAG
85
4
0
04 Apr 2024
How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?
Siye Wu
Jian Xie
Jiangjie Chen
Tinghui Zhu
Kai Zhang
Yanghua Xiao
KELM
101
23
0
04 Apr 2024
Reason from Fallacy: Enhancing Large Language Models' Logical Reasoning through Logical Fallacy Understanding
Yanda Li
Dixuan Wang
Jiaqing Liang
Guochao Jiang
Qi He
Yanghua Xiao
Deqing Yang
LRM ELM
111
7
0
04 Apr 2024
Investigating Regularization of Self-Play Language Models
Réda Alami
Abdalgader Abubaker
Mastane Achab
M. Seddik
Salem Lahlou
75
3
0
04 Apr 2024
Okay, Let's Do This! Modeling Event Coreference with Generated Rationales and Knowledge Distillation
Abhijnan Nath
Shadi Manafi
Avyakta Chelle
Nikhil Krishnaswamy
90
2
0
04 Apr 2024
Do Large Language Models Rank Fairly? An Empirical Study on the Fairness of LLMs as Rankers
Yuan Wang
Xuyang Wu
Hsin-Tai Wu
Zhiqiang Tao
Yi Fang
ALM
78
10
0
04 Apr 2024
Uncertainty in Language Models: Assessment through Rank-Calibration
Xinmeng Huang
Shuo Li
Mengxin Yu
Matteo Sesia
Hamed Hassani
Insup Lee
Osbert Bastani
Yan Sun
97
20
0
04 Apr 2024
GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification
Ali Pesaranghader
Nikhil Verma
Manasa Bharadwaj
93
5
0
03 Apr 2024
Generative AI in the Wild: Prospects, Challenges, and Strategies
Yuan Sun
Eunchae Jang
Fenglong Ma
Ting Wang
75
28
0
03 Apr 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Keyu Tian
Yi Jiang
Zehuan Yuan
Bingyue Peng
Liwei Wang
VGen
127
349
0
03 Apr 2024
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
Yifan Xu
Xiao Liu
Xinghan Liu
Zhenyu Hou
Yueyan Li
...
Aohan Zeng
Zhengxiao Du
Wenyi Zhao
Jie Tang
Yuxiao Dong
LRM
103
42
0
03 Apr 2024
Empowering Biomedical Discovery with AI Agents
Shanghua Gao
Ada Fang
Yepeng Huang
Valentina Giunchiglia
Ayush Noori
Jonathan Richard Schwarz
Yasha Ektefaie
Jovana Kondic
Marinka Zitnik
LLMAG AI4CE
108
100
0
03 Apr 2024
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
Qi Luo
Hengxu Yu
Xiao Li
92
6
0
03 Apr 2024
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models
Haoran Sun
Lixin Liu
Junjie Li
Fengyu Wang
Baohua Dong
Ran Lin
Ruohui Huang
76
20
0
03 Apr 2024
Evolving Agents: Interactive Simulation of Dynamic and Diverse Human Personalities
Jiale Li
Jiayang Li
Jiahao Chen
Yifan Li
Shijie Wang
Hugo Zhou
Minjun Ye
Yunsheng Su
AI4CE
114
4
0
03 Apr 2024
Calibrating the Confidence of Large Language Models by Eliciting Fidelity
Mozhi Zhang
Mianqiu Huang
Rundong Shi
Linsen Guo
Chong Peng
Peng Yan
Yaqian Zhou
Xipeng Qiu
86
13
0
03 Apr 2024
Lifelong Event Detection with Embedding Space Separation and Compaction
Chengwei Qin
Ruirui Chen
Ruochen Zhao
Wenhan Xia
Shafiq Joty
KELM CLL
92
2
0
03 Apr 2024
Task Agnostic Architecture for Algorithm Induction via Implicit Composition
Sahil J. Sindhi
Ignas Budvytis
87
0
0
03 Apr 2024
Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data
Parth Patwa
Simone Filice
Zhiyu Zoey Chen
Giuseppe Castellucci
Oleg Rokhlenko
S. Malmasi
75
7
0
03 Apr 2024
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
Fanxu Meng
Zhaohui Wang
Muhan Zhang
VLM
159
104
0
03 Apr 2024
Deconstructing In-Context Learning: Understanding Prompts via Corruption
Namrata Shivagunde
Vladislav Lialin
Sherin Muckatira
Anna Rumshisky
101
3
0
02 Apr 2024
Risks from Language Models for Automated Mental Healthcare: Ethics and Structure for Implementation
Declan Grabb
Max Lamparth
N. Vasan
91
17
0
02 Apr 2024
HyperCLOVA X Technical Report
Kang Min Yoo
Jaegeun Han
Sookyo In
Heewon Jeon
Jisu Jeong
...
Hyunkyung Noh
Se-Eun Choi
Sang-Woo Lee
Jung Hwa Lim
Nako Sung
VLM
88
9
0
02 Apr 2024
VLRM: Vision-Language Models act as Reward Models for Image Captioning
Maksim Dzabraev
Alexander Kunitsyn
Andrei Ivaniuta
VLM MLLM
73
3
0
02 Apr 2024
Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
Philipp Mondorf
Barbara Plank
ELM LRM LM&MA
170
52
0
02 Apr 2024