Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2203.02155
Cited By
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Training language models to follow instructions with human feedback"
50 / 6,381 papers shown
Title
JRDB-Social: A Multifaceted Robotic Dataset for Understanding of Context and Dynamics of Human Interactions Within Social Groups
Simindokht Jahangard
Zhixi Cai
Shiki Wen
Hamid Rezatofighi
56
6
0
06 Apr 2024
Binary Classifier Optimization for Large Language Model Alignment
Seungjae Jung
Gunsoo Han
D. W. Nam
Kyoung-Woon On
82
25
0
06 Apr 2024
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
Yann Dubois
Balázs Galambosi
Percy Liang
Tatsunori Hashimoto
ALM
179
403
0
06 Apr 2024
Bayesian Additive Regression Networks
D. V. Boxel
UQCV
BDL
82
0
0
05 Apr 2024
Prompt Public Large Language Models to Synthesize Data for Private On-device Applications
Shanshan Wu
Zheng Xu
Yanxiang Zhang
Yuanbo Zhang
Daniel Ramage
SyDa
88
12
0
05 Apr 2024
Pixel-wise RL on Diffusion Models: Reinforcement Learning from Rich Feedback
Mo Kordzanganeh
Danial Keshvary
Nariman Arian
EGVM
52
0
0
05 Apr 2024
Scope Ambiguities in Large Language Models
Gaurav Kamath
Sebastian Schuster
Sowmya Vajjala
Siva Reddy
69
6
0
05 Apr 2024
Social Skill Training with Large Language Models
Diyi Yang
Caleb Ziems
William B. Held
Omar Shaikh
Michael S. Bernstein
John C. Mitchell
LLMAG
80
11
0
05 Apr 2024
Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation
Xinyu Ma
Xu Chu
Zhibang Yang
Yang Lin
Xin Gao
Junfeng Zhao
96
10
0
05 Apr 2024
ROPO: Robust Preference Optimization for Large Language Models
Xize Liang
Chao Chen
Shuang Qiu
Jie Wang
Yue-bo Wu
Zhihang Fu
Zhihao Shi
Feng Wu
Jieping Ye
86
3
0
05 Apr 2024
Dynamic Prompt Optimizing for Text-to-Image Generation
Wenyi Mo
Tianyu Zhang
Yalong Bai
Fuchun Sun
Ji-Rong Wen
Qing Yang
86
13
0
05 Apr 2024
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer
Hele-Andra Kuulmets
Taido Purason
Agnes Luhtaru
Mark Fishel
78
19
0
05 Apr 2024
Can only LLMs do Reasoning?: Potential of Small Language Models in Task Planning
Gawon Choi
Hyemin Ahn
LM&Ro
LRM
59
1
0
05 Apr 2024
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
Jingyu Zhang
Marc Marone
Tianjian Li
Benjamin Van Durme
Daniel Khashabi
195
9
0
05 Apr 2024
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
Makesh Narsimhan Sreedhar
Traian Rebedea
Shaona Ghosh
Jiaqi Zeng
Christopher Parisien
ALM
103
6
0
04 Apr 2024
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Corby Rosset
Ching-An Cheng
Arindam Mitra
Michael Santacroce
Ahmed Hassan Awadallah
Tengyang Xie
209
132
0
04 Apr 2024
Evaluating LLMs at Detecting Errors in LLM Responses
Ryo Kamoi
Sarkar Snigdha Sarathi Das
Renze Lou
Jihyun Janice Ahn
Yilun Zhao
...
Salika Dave
Shaobo Qin
Arman Cohan
Wenpeng Yin
Rui Zhang
86
25
0
04 Apr 2024
Intent Detection and Entity Extraction from BioMedical Literature
Ankan Mullick
Mukur Gupta
Pawan Goyal
MedIm
80
2
0
04 Apr 2024
ReFT: Representation Finetuning for Language Models
Zhengxuan Wu
Aryaman Arora
Zheng Wang
Atticus Geiger
Daniel Jurafsky
Christopher D. Manning
Christopher Potts
OffRL
122
72
0
04 Apr 2024
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm
Miao Lu
Han Zhong
Tong Zhang
Jose H. Blanchet
OffRL
OOD
107
10
0
04 Apr 2024
Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models
Yantao Liu
Zijun Yao
Xin Lv
Yuchen Fan
S. Cao
Jifan Yu
Lei Hou
Juanzi Li
105
3
0
04 Apr 2024
Personalized LLM Response Generation with Parameterized Memory Injection
Kai Zhang
Lizhi Qing
Yangyang Kang
119
11
0
04 Apr 2024
Scaffolding Language Learning via Multi-modal Tutoring Systems with Pedagogical Instructions
Zhengyuan Liu
Stella Xin Yin
Carolyn Lee
Nancy F. Chen
AI4Ed
65
18
0
04 Apr 2024
LongVLM: Efficient Long Video Understanding via Large Language Models
Yuetian Weng
Mingfei Han
Haoyu He
Xiaojun Chang
Bohan Zhuang
VLM
127
65
0
04 Apr 2024
Learning to Plan and Generate Text with Citations
Constanza Fierro
Reinald Kim Amplayo
Fantine Huot
Nicola De Cao
Joshua Maynez
Shashi Narayan
Mirella Lapata
82
19
0
04 Apr 2024
Using Large Language Models to Enrich the Documentation of Datasets for Machine Learning
Joan Giner-Miguelez
Abel Gómez
Jordi Cabot
LLMAG
85
4
0
04 Apr 2024
How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?
Siye Wu
Jian Xie
Jiangjie Chen
Tinghui Zhu
Kai Zhang
Yanghua Xiao
KELM
101
23
0
04 Apr 2024
Reason from Fallacy: Enhancing Large Language Models' Logical Reasoning through Logical Fallacy Understanding
Yanda Li
Dixuan Wang
Jiaqing Liang
Guochao Jiang
Qi He
Yanghua Xiao
Deqing Yang
LRM
ELM
111
7
0
04 Apr 2024
Investigating Regularization of Self-Play Language Models
Réda Alami
Abdalgader Abubaker
Mastane Achab
M. Seddik
Salem Lahlou
75
3
0
04 Apr 2024
Okay, Let's Do This! Modeling Event Coreference with Generated Rationales and Knowledge Distillation
Abhijnan Nath
Shadi Manafi
Avyakta Chelle
Nikhil Krishnaswamy
90
2
0
04 Apr 2024
Do Large Language Models Rank Fairly? An Empirical Study on the Fairness of LLMs as Rankers
Yuan Wang
Xuyang Wu
Hsin-Tai Wu
Zhiqiang Tao
Yi Fang
ALM
78
10
0
04 Apr 2024
Uncertainty in Language Models: Assessment through Rank-Calibration
Xinmeng Huang
Shuo Li
Mengxin Yu
Matteo Sesia
Hamed Hassani
Insup Lee
Osbert Bastani
Yan Sun
97
20
0
04 Apr 2024
GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification
Ali Pesaranghader
Nikhil Verma
Manasa Bharadwaj
93
5
0
03 Apr 2024
Generative AI in the Wild: Prospects, Challenges, and Strategies
Yuan Sun
Eunchae Jang
Fenglong Ma
Ting Wang
75
28
0
03 Apr 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Keyu Tian
Yi Jiang
Zehuan Yuan
Bingyue Peng
Liwei Wang
VGen
127
349
0
03 Apr 2024
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
Yifan Xu
Xiao Liu
Xinghan Liu
Zhenyu Hou
Yueyan Li
...
Aohan Zeng
Zhengxiao Du
Wenyi Zhao
Jie Tang
Yuxiao Dong
LRM
103
42
0
03 Apr 2024
Empowering Biomedical Discovery with AI Agents
Shanghua Gao
Ada Fang
Yepeng Huang
Valentina Giunchiglia
Ayush Noori
Jonathan Richard Schwarz
Yasha Ektefaie
Jovana Kondic
Marinka Zitnik
LLMAG
AI4CE
108
100
0
03 Apr 2024
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
Qi Luo
Hengxu Yu
Xiao Li
92
6
0
03 Apr 2024
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models
Haoran Sun
Lixin Liu
Junjie Li
Fengyu Wang
Baohua Dong
Ran Lin
Ruohui Huang
76
20
0
03 Apr 2024
Evolving Agents: Interactive Simulation of Dynamic and Diverse Human Personalities
Jiale Li
Jiayang Li
Jiahao Chen
Yifan Li
Shijie Wang
Hugo Zhou
Minjun Ye
Yunsheng Su
AI4CE
114
4
0
03 Apr 2024
Calibrating the Confidence of Large Language Models by Eliciting Fidelity
Mozhi Zhang
Mianqiu Huang
Rundong Shi
Linsen Guo
Chong Peng
Peng Yan
Yaqian Zhou
Xipeng Qiu
86
13
0
03 Apr 2024
Lifelong Event Detection with Embedding Space Separation and Compaction
Chengwei Qin
Ruirui Chen
Ruochen Zhao
Wenhan Xia
Shafiq Joty
KELM
CLL
92
2
0
03 Apr 2024
Task Agnostic Architecture for Algorithm Induction via Implicit Composition
Sahil J. Sindhi
Ignas Budvytis
87
0
0
03 Apr 2024
Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data
Parth Patwa
Simone Filice
Zhiyu Zoey Chen
Giuseppe Castellucci
Oleg Rokhlenko
S. Malmasi
75
7
0
03 Apr 2024
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
Fanxu Meng
Zhaohui Wang
Muhan Zhang
VLM
159
104
0
03 Apr 2024
Deconstructing In-Context Learning: Understanding Prompts via Corruption
Namrata Shivagunde
Vladislav Lialin
Sherin Muckatira
Anna Rumshisky
101
3
0
02 Apr 2024
Risks from Language Models for Automated Mental Healthcare: Ethics and Structure for Implementation
Declan Grabb
Max Lamparth
N. Vasan
91
17
0
02 Apr 2024
HyperCLOVA X Technical Report
Kang Min Yoo
Jaegeun Han
Sookyo In
Heewon Jeon
Jisu Jeong
...
Hyunkyung Noh
Se-Eun Choi
Sang-Woo Lee
Jung Hwa Lim
Nako Sung
VLM
88
9
0
02 Apr 2024
VLRM: Vision-Language Models act as Reward Models for Image Captioning
Maksim Dzabraev
Alexander Kunitsyn
Andrei Ivaniuta
VLM
MLLM
73
3
0
02 Apr 2024
Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
Philipp Mondorf
Barbara Plank
ELM
LRM
LM&MA
170
52
0
02 Apr 2024
Previous
1
2
3
...
83
84
85
...
126
127
128
Next