2203.02155
Cited By
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing
"Training language models to follow instructions with human feedback"
50 / 6,395 papers shown
Title
MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models
Yixiao Zhang
Yukara Ikemiya
Gus Xia
Naoki Murata
Marco A. Martínez-Ramírez
Wei-Hsiang Liao
Yuki Mitsufuji
Simon Dixon
138
23
0
09 Feb 2024
Task Supportive and Personalized Human-Large Language Model Interaction: A User Study
Ben Wang
Jiqun Liu
Jamshed Karimnazarov
Nicolas Thompson
52
18
0
09 Feb 2024
ContPhy: Continuum Physical Concept Learning and Reasoning from Videos
Zhicheng Zheng
Xin Yan
Zhenfang Chen
Jingzhou Wang
Qin Zhi Eddie Lim
Joshua B. Tenenbaum
Chuang Gan
LRM
76
10
0
09 Feb 2024
ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling
Siming Yan
Min Bai
Weifeng Chen
Xiong Zhou
Qixing Huang
Erran L. Li
VLM
57
20
0
09 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomas Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
248
426
0
09 Feb 2024
Rethinking Data Selection for Supervised Fine-Tuning
Ming Shen
46
21
0
08 Feb 2024
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models
Hainiu Xu
Runcong Zhao
Lixing Zhu
Bin Liang
Yulan He
158
25
0
08 Feb 2024
A Prompt Response to the Demand for Automatic Gender-Neutral Translation
Beatrice Savoldi
Andrea Piergentili
Dennis Fucci
Matteo Negri
L. Bentivogli
78
15
0
08 Feb 2024
Large Language Model Meets Graph Neural Network in Knowledge Distillation
Shengxiang Hu
Guobing Zou
Song Yang
Yanglan Gan
Bofeng Zhang
Yixin Chen
118
7
0
08 Feb 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Zhiheng Xi
Wenxiang Chen
Boyang Hong
Senjie Jin
Rui Zheng
...
Xinbo Zhang
Peng Sun
Tao Gui
Qi Zhang
Xuanjing Huang
LRM
74
28
0
08 Feb 2024
UFO: A UI-Focused Agent for Windows OS Interaction
Chaoyun Zhang
Liqun Li
Shilin He
Xu Zhang
Bo Qiao
...
Yu Kang
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
LLMAG
178
83
0
08 Feb 2024
Generalized Preference Optimization: A Unified Approach to Offline Alignment
Yunhao Tang
Z. Guo
Zeyu Zheng
Daniele Calandriello
Rémi Munos
Mark Rowland
Pierre Harvey Richemond
Michal Valko
Bernardo Avila-Pires
Bilal Piot
85
121
0
08 Feb 2024
Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents
Yuxi Wei
Zi Wang
Yifan Lu
Chenxin Xu
Chang-rui Liu
Hao Zhao
Siheng Chen
Yanfeng Wang
VGen
131
75
0
08 Feb 2024
TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation
Yikai Zhang
Siyu Yuan
Caiyu Hu
Kyle Richardson
Yanghua Xiao
Jiangjie Chen
AI4CE
LLMAG
73
18
0
08 Feb 2024
Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation
Xianghe Pang
Shuo Tang
Rui Ye
Yuxin Xiong
Bolun Zhang
Yanfeng Wang
Siheng Chen
197
36
0
08 Feb 2024
Rocks Coding, Not Development--A Human-Centric, Experimental Evaluation of LLM-Supported SE Tasks
Wei Wang
Huilong Ning
Gaowei Zhang
Libo Liu
Yi Wang
116
16
0
08 Feb 2024
Benchmarking Large Language Models on Communicative Medical Coaching: a Novel System and Dataset
Hengguan Huang
Songtao Wang
Hongfu Liu
Hao Wang
Ye Wang
LM&MA
85
4
0
08 Feb 2024
GPTs Are Multilingual Annotators for Sequence Generation Tasks
Juhwan Choi
Eunju Lee
Kyohoon Jin
Youngbin Kim
68
11
0
08 Feb 2024
Rapid Optimization for Jailbreaking LLMs via Subconscious Exploitation and Echopraxia
Guangyu Shen
Shuyang Cheng
Kai-xian Zhang
Guanhong Tao
Shengwei An
Lu Yan
Zhuo Zhang
Shiqing Ma
Xiangyu Zhang
78
15
0
08 Feb 2024
Large Language Models for Psycholinguistic Plausibility Pretesting
S. Amouyal
A. Meltzer-Asscher
Jonathan Berant
ELM
50
7
0
08 Feb 2024
In-Context Principle Learning from Mistakes
Tianjun Zhang
Aman Madaan
Luyu Gao
Steven Zheng
Swaroop Mishra
Yiming Yang
Niket Tandon
Uri Alon
KELM
ReLM
111
27
0
08 Feb 2024
Noise Contrastive Alignment of Language Models with Explicit Rewards
Huayu Chen
Guande He
Lifan Yuan
Ganqu Cui
Hang Su
Jun Zhu
113
56
0
08 Feb 2024
Principled Preferential Bayesian Optimization
Wenjie Xu
Wenbin Wang
Yuning Jiang
B. Svetozarevic
Colin N. Jones
69
8
0
08 Feb 2024
A Survey on Safe Multi-Modal Learning System
Tianyi Zhao
Liangliang Zhang
Yao Ma
Lu Cheng
158
14
0
08 Feb 2024
Learning mirror maps in policy mirror descent
Carlo Alfano
Sebastian Towers
Silvia Sapora
Chris Xiaoxuan Lu
Patrick Rebeschini
63
0
0
07 Feb 2024
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Boyi Wei
Kaixuan Huang
Yangsibo Huang
Tinghao Xie
Xiangyu Qi
Mengzhou Xia
Prateek Mittal
Mengdi Wang
Peter Henderson
AAML
162
118
0
07 Feb 2024
Pedagogical Alignment of Large Language Models
Shashank Sonkar
Kangqi Ni
Sapana Chaudhary
Richard G. Baraniuk
AI4Ed
44
9
0
07 Feb 2024
An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration
Yihao Li
Ru Zhang
Jianyi Liu
LRM
113
16
0
07 Feb 2024
Reconfidencing LLMs from the Grouping Loss Perspective
Lihu Chen
Alexandre Perez-Lebel
Fabian M. Suchanek
Gaël Varoquaux
310
12
0
07 Feb 2024
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
Hao Zhao
Maksym Andriushchenko
Francesco Croce
Nicolas Flammarion
ALM
169
57
0
07 Feb 2024
Direct Language Model Alignment from Online AI Feedback
Shangmin Guo
Biao Zhang
Tianlin Liu
Tianqi Liu
Misha Khalman
...
Thomas Mesnard
Yao-Min Zhao
Bilal Piot
Johan Ferret
Mathieu Blondel
ALM
114
160
0
07 Feb 2024
A Hypothesis-Driven Framework for the Analysis of Self-Rationalising Models
Marc Braun
Jenny Kunz
42
3
0
07 Feb 2024
InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior
Chenguo Lin
Yadong Mu
3DV
72
40
0
07 Feb 2024
Source Identification in Abstractive Summarization
Yoshi Suhara
Dimitris Alikaniotis
67
2
0
07 Feb 2024
TransLLaMa: LLM-based Simultaneous Translation System
Roman Koshkin
Katsuhito Sudoh
Satoshi Nakamura
55
26
0
07 Feb 2024
SPARQL Generation: an analysis on fine-tuning OpenLLaMA for Question Answering over a Life Science Knowledge Graph
Julio Cesar Rangel Reyes
T. M. Farias
A. Sima
Norio Kobayashi
59
16
0
07 Feb 2024
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
Chaojun Xiao
Pengle Zhang
Xu Han
Guangxuan Xiao
Yankai Lin
Zhengyan Zhang
Zhiyuan Liu
Maosong Sun
LLMAG
132
55
0
07 Feb 2024
Alirector: Alignment-Enhanced Chinese Grammatical Error Corrector
Haihui Yang
Xiaojun Quan
3DV
112
3
0
07 Feb 2024
S-Agents: Self-organizing Agents in Open-ended Environments
Jia-Qing Chen
Yu-Gang Jiang
Jiachen Lu
Li Zhang
AIFin
LLMAG
LM&Ro
98
16
0
07 Feb 2024
Can Large Language Model Agents Simulate Human Trust Behaviors?
Chengxing Xie
Canyu Chen
Feiran Jia
Ziyu Ye
Kai Shu
Adel Bibi
Ziniu Hu
Philip Torr
Guohao Li
Ge Li
LM&Ro
LLMAG
138
60
0
07 Feb 2024
Detecting Mode Collapse in Language Models via Narration
Sil Hamilton
51
9
0
06 Feb 2024
DFA-RAG: Conversational Semantic Router for Large Language Model with Definite Finite Automaton
Yiyou Sun
Junjie Hu
Wei Cheng
Haifeng Chen
RALM
AI4CE
107
1
0
06 Feb 2024
LegalLens: Leveraging LLMs for Legal Violation Identification in Unstructured Text
Dor Bernsohn
Gil Semo
Yaron Vazana
Gila Hayat
Ben Hagag
Joel Niklaus
Rohit Saha
Kyryl Truskovskyi
AILaw
68
18
0
06 Feb 2024
LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia
Sadhika Malladi
Suchin Gururangan
Sanjeev Arora
Danqi Chen
166
247
0
06 Feb 2024
SceMQA: A Scientific College Entrance Level Multimodal Question Answering Benchmark
Zhenwen Liang
Kehan Guo
Gang Liu
Taicheng Guo
Yujun Zhou
Tianyu Yang
Jiajun Jiao
Renjie Pi
Jipeng Zhang
Xiangliang Zhang
ELM
90
24
0
06 Feb 2024
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
Yu Du
Fangyun Wei
Hongyang R. Zhang
LLMAG
103
46
0
06 Feb 2024
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Mantas Mazeika
Long Phan
Xuwang Yin
Andy Zou
Zifan Wang
...
Nathaniel Li
Steven Basart
Bo Li
David A. Forsyth
Dan Hendrycks
AAML
125
419
0
06 Feb 2024
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science
Xiangru Tang
Qiao Jin
Kunlun Zhu
Tongxin Yuan
Yichi Zhang
...
Jian Tang
Zhuosheng Zhang
Arman Cohan
Zhiyong Lu
Mark B. Gerstein
LLMAG
ELM
119
47
0
06 Feb 2024
MusicRL: Aligning Music Generation to Human Preferences
Geoffrey Cideron
Sertan Girgin
Mauro Verzetti
Damien Vincent
Matej Kastelic
...
Olivier Pietquin
Matthieu Geist
Léonard Hussenot
Neil Zeghidour
A. Agostinelli
91
22
0
06 Feb 2024
Harnessing the Plug-and-Play Controller by Prompting
Hao Wang
Lei Sha
73
4
0
06 Feb 2024