1909.08593
Fine-Tuning Language Models from Human Preferences
18 September 2019
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
Papers citing "Fine-Tuning Language Models from Human Preferences"
50 / 1,265 papers shown
Augmented Language Models: a Survey
Grégoire Mialon
Roberto Dessì
Maria Lomeli
Christoforos Nalmpantis
Ramakanth Pasunuru
...
Jane Dwivedi-Yu
Asli Celikyilmaz
Edouard Grave
Yann LeCun
Thomas Scialom
LRM
KELM
99
391
0
15 Feb 2023
Conversational AI-Powered Design: ChatGPT as Designer, User, and Product
A. Kocaballi
65
40
0
15 Feb 2023
MarioGPT: Open-Ended Text2Level Generation through Large Language Models
Shyam Sudhakaran
Miguel González Duque
Claire Glanois
Matthias Anton Freiberger
Elias Najarro
S. Risi
VLM
100
58
0
12 Feb 2023
The Wisdom of Hindsight Makes Language Models Better Instruction Followers
Tianjun Zhang
Fangchen Liu
Justin Wong
Pieter Abbeel
Joseph E. Gonzalez
103
47
0
10 Feb 2023
Leveraging Demonstrations to Improve Online Learning: Quality Matters
Botao Hao
Rahul Jain
Tor Lattimore
Benjamin Van Roy
Zheng Wen
119
11
0
07 Feb 2023
Chain of Hindsight Aligns Language Models with Feedback
Hao Liu
Carmelo Sferrazza
Pieter Abbeel
ALM
139
124
0
06 Feb 2023
Regulating ChatGPT and other Large Generative AI Models
P. Hacker
A. Engel
M. Mauer
AILaw
148
353
0
05 Feb 2023
Using In-Context Learning to Improve Dialogue Safety
Nicholas Meade
Spandana Gella
Devamanyu Hazarika
Prakhar Gupta
Di Jin
Siva Reddy
Yang Liu
Dilek Z. Hakkani-Tür
121
39
0
02 Feb 2023
Benchmarking Large Language Models for News Summarization
Tianyi Zhang
Faisal Ladhak
Esin Durmus
Percy Liang
Kathleen McKeown
Tatsunori B. Hashimoto
ELM
128
535
0
31 Jan 2023
Conversational Automated Program Repair
Chun Xia
Lingming Zhang
KELM
108
74
0
30 Jan 2023
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
E. Mitchell
Yoonho Lee
Alexander Khazatsky
Christopher D. Manning
Chelsea Finn
127
632
0
26 Jan 2023
Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons
Banghua Zhu
Jiantao Jiao
Michael I. Jordan
OffRL
147
209
0
26 Jan 2023
On The Fragility of Learned Reward Functions
Lev McKinney
Yawen Duan
David M. Krueger
Adam Gleave
89
20
0
09 Jan 2023
Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes
Justin Reppert
Ben Rachbach
Charlie George
Luke Stebbing
Ju-Seung Byun
Maggie Appleton
Andreas Stuhlmuller
ReLM
LRM
133
17
0
04 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
189
36
0
01 Jan 2023
Inclusive Artificial Intelligence
Dilip Arumugam
Shi Dong
Benjamin Van Roy
67
1
0
24 Dec 2022
Methodological reflections for AI alignment research using human feedback
Thilo Hagendorff
Sarah Fabi
73
6
0
22 Dec 2022
Critic-Guided Decoding for Controlled Text Generation
Minbeom Kim
Hwanhee Lee
Kang Min Yoo
Joonsuk Park
Hwaran Lee
Kyomin Jung
115
36
0
21 Dec 2022
JASMINE: Arabic GPT Models for Few-Shot Learning
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
AbdelRahim Elmadany
Alcides Alcoba Inciarte
Md. Tawkat Islam Khondaker
77
8
0
21 Dec 2022
Human-in-the-loop Abstractive Dialogue Summarization
Jiaao Chen
Mohan Dodda
Diyi Yang
79
10
0
19 Dec 2022
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA
ALM
108
102
0
19 Dec 2022
Continual Learning for Instruction Following from Realtime Feedback
Alane Suhr
Yoav Artzi
74
18
0
19 Dec 2022
Optimizing Prompts for Text-to-Image Generation
Y. Hao
Zewen Chi
Li Dong
Furu Wei
125
152
0
19 Dec 2022
Controllable Text Generation via Probability Density Estimation in the Latent Space
Yuxuan Gu
Xiaocheng Feng
Sicheng Ma
Lingyuan Zhang
Heng Gong
Weihong Zhong
Bing Qin
90
18
0
16 Dec 2022
An Information-Theoretic Analysis of Compute-Optimal Neural Scaling Laws
Hong Jun Jeon
Benjamin Van Roy
72
0
0
02 Dec 2022
Fine-tuning language models to find agreement among humans with diverse preferences
Michiel A. Bakker
Martin Chadwick
Hannah R. Sheahan
Michael Henry Tessler
Lucy Campbell-Gillingham
...
Nat McAleese
Amelia Glaese
John Aslanides
M. Botvinick
Christopher Summerfield
ALM
110
237
0
28 Nov 2022
Solving math word problems with process- and outcome-based feedback
J. Uesato
Nate Kushman
Ramana Kumar
Francis Song
Noah Y. Siegel
L. Wang
Antonia Creswell
G. Irving
I. Higgins
FaML
ReLM
AIMat
LRM
133
362
0
25 Nov 2022
Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback
Josh Abramson
Arun Ahuja
Federico Carnevale
Petko Georgiev
Alex Goldin
...
Tamara von Glehn
Greg Wayne
Nathaniel Wong
Chen Yan
Rui Zhu
76
28
0
21 Nov 2022
Machine Learning Approaches for Principle Prediction in Naturally Occurring Stories
Md Sultan al Nahian
Spencer Frazier
Brent Harrison
Mark O. Riedl
FaML
AI4TS
79
0
0
19 Nov 2022
GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation
Biyang Guo
Yeyun Gong
Yelong Shen
Songqiao Han
Hailiang Huang
Nan Duan
Weizhu Chen
VLM
80
19
0
18 Nov 2022
The Expertise Problem: Learning from Specialized Feedback
Oliver Daniels-Koch
Rachel Freedman
OffRL
67
18
0
12 Nov 2022
The CRINGE Loss: Learning what language not to model
Leonard Adolphs
Tianyu Gao
Jing Xu
Kurt Shuster
Sainbayar Sukhbaatar
Jason Weston
MU
95
37
0
10 Nov 2022
Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control
Xiang Fan
Yiwei Lyu
Paul Pu Liang
Ruslan Salakhutdinov
Louis-Philippe Morency
BDL
96
6
0
10 Nov 2022
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Michael J. Smith
James E. Geach
76
36
0
07 Nov 2022
Do Users Write More Insecure Code with AI Assistants?
Neil Perry
Megha Srivastava
Deepak Kumar
Dan Boneh
ELM
AAML
77
180
0
07 Nov 2022
Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning
Yu Meng
Martin Michalski
Jiaxin Huang
Yu Zhang
Tarek Abdelzaher
Jiawei Han
VLM
120
49
0
06 Nov 2022
Large Language Models Are Human-Level Prompt Engineers
Yongchao Zhou
Andrei Ioan Muresanu
Ziwen Han
Keiran Paster
Silviu Pitis
Harris Chan
Jimmy Ba
ALM
LLMAG
195
904
0
03 Nov 2022
Fine-Tuning Language Models via Epistemic Neural Networks
Ian Osband
S. Asghari
Benjamin Van Roy
Nat McAleese
John Aslanides
G. Irving
UQLM
81
20
0
03 Nov 2022
Generating Sequences by Learning to Self-Correct
Sean Welleck
Ximing Lu
Peter West
Faeze Brahman
T. Shen
Daniel Khashabi
Yejin Choi
LRM
111
238
0
31 Oct 2022
Evaluating Long-Term Memory in 3D Mazes
J. Pašukonis
Timothy Lillicrap
Danijar Hafner
3DV
88
23
0
24 Oct 2022
Language Detoxification with Attribute-Discriminative Latent Space
Jin Myung Kwak
Minseon Kim
Sung Ju Hwang
63
14
0
19 Oct 2022
Mitigating Covertly Unsafe Text within Natural Language Systems
Alex Mei
Anisha Kabir
Sharon Levy
Melanie Subbiah
Emily Allaway
J. Judge
D. Patton
Bruce Bimber
Kathleen McKeown
William Yang Wang
122
13
0
17 Oct 2022
A Distributional Lens for Multi-Aspect Controllable Text Generation
Yuxuan Gu
Xiaocheng Feng
Sicheng Ma
Lingyuan Zhang
Heng Gong
Bing Qin
176
37
0
06 Oct 2022
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
Zhijing Jin
Sydney Levine
Fernando Gonzalez
Ojasv Kamal
Maarten Sap
Mrinmaya Sachan
Rada Mihalcea
J. Tenenbaum
Bernhard Schölkopf
ELM
LRM
103
103
0
04 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
105
250
0
03 Oct 2022
Calibrating Sequence likelihood Improves Conditional Language Generation
Yao-Min Zhao
Misha Khalman
Rishabh Joshi
Shashi Narayan
Mohammad Saleh
Peter J. Liu
UQLM
111
135
0
30 Sep 2022
On the Impossible Safety of Large AI Models
El-Mahdi El-Mhamdi
Sadegh Farhadkhani
R. Guerraoui
Nirupam Gupta
L. Hoang
Rafael Pinot
Sébastien Rouault
John Stephan
110
33
0
30 Sep 2022
Argumentative Reward Learning: Reasoning About Human Preferences
Francis Rhys Ward
Francesco Belardinelli
Francesca Toni
HAI
147
2
0
28 Sep 2022
APPDIA: A Discourse-aware Transformer-based Style Transfer Model for Offensive Social Media Conversations
Katherine Atwell
Sabit Hassan
Malihe Alikhani
89
31
0
17 Sep 2022
Selective Token Generation for Few-shot Natural Language Generation
DaeJin Jo
Taehwan Kwon
Eun-Sol Kim
Sungwoong Kim
70
1
0
17 Sep 2022