2009.01325
Learning to summarize from human feedback
2 September 2020
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
Papers citing
"Learning to summarize from human feedback"
48 / 1,548 papers shown
Title
Integrating Human-in-the-loop into Swarm Learning for Decentralized Fake News Detection
Xishuang Dong
Lijun Qian
87
9
0
04 Jan 2022
WebGPT: Browser-assisted question-answering with human feedback
Reiichiro Nakano
Jacob Hilton
S. Balaji
Jeff Wu
Long Ouyang
...
Gretchen Krueger
Kevin Button
Matthew Knight
B. Chess
John Schulman
ALM
RALM
202
1,299
0
17 Dec 2021
Reframing Human-AI Collaboration for Generating Free-Text Explanations
Sarah Wiegreffe
Jack Hessel
Swabha Swayamdipta
Mark O. Riedl
Yejin Choi
77
149
0
16 Dec 2021
A General Language Assistant as a Laboratory for Alignment
Amanda Askell
Yuntao Bai
Anna Chen
Dawn Drain
Deep Ganguli
...
Tom B. Brown
Jack Clark
Sam McCandlish
C. Olah
Jared Kaplan
ALM
145
791
0
01 Dec 2021
Expressive Communication: A Common Framework for Evaluating Developments in Generative Models and Steering Interfaces
Ryan Louie
Jesse Engel
Cheng-Zhi Anna Huang
60
9
0
29 Nov 2021
Robust Deep Reinforcement Learning for Extractive Legal Summarization
Duy-Hung Nguyen
Bao-Sinh Nguyen
Nguyen-Viet-Dung Nghiem
Dung Tien Le
Mim Amina Khatun
Minh Le Nguyen
Hung Le
ELM
AILaw
AI4TS
126
18
0
13 Nov 2021
B-Pref: Benchmarking Preference-Based Reinforcement Learning
Kimin Lee
Laura M. Smith
Anca Dragan
Pieter Abbeel
OffRL
125
101
0
04 Nov 2021
Deep Transfer Learning & Beyond: Transformer Language Models in Information Systems Research
Ross Gruetzemacher
D. Paradice
94
35
0
18 Oct 2021
The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail
Sam Bowman
OffRL
120
45
0
15 Oct 2021
Truthful AI: Developing and governing AI that does not lie
Owain Evans
Owen Cotton-Barratt
Lukas Finnveden
Adam Bales
Avital Balwit
Peter Wills
Luca Righetti
William Saunders
HILM
307
118
0
13 Oct 2021
Calibrate your listeners! Robust communication-based training for pragmatic speakers
Rose E. Wang
Julia White
Jesse Mu
Noah D. Goodman
69
7
0
11 Oct 2021
An Empirical Investigation of Learning from Biased Toxicity Labels
Neel Nanda
J. Uesato
Sven Gowal
31
0
0
04 Oct 2021
Recursively Summarizing Books with Human Feedback
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nisan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
262
303
0
22 Sep 2021
Learning Natural Language Generation from Scratch
Alice Martin Donati
Guillaume Quispe
Charles Ollion
Sylvain Le Corff
Florian Strub
Olivier Pietquin
LRM
58
4
0
20 Sep 2021
Generating Self-Contained and Summary-Centric Question Answer Pairs via Differentiable Reward Imitation Learning
Li Zhou
Kevin Small
Yong Zhang
Sandeep Atluri
83
2
0
10 Sep 2021
TruthfulQA: Measuring How Models Mimic Human Falsehoods
Stephanie C. Lin
Jacob Hilton
Owain Evans
HILM
163
1,956
0
08 Sep 2021
Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
Noriyuki Kojima
Alane Suhr
Yoav Artzi
74
25
0
10 Aug 2021
High Quality Related Search Query Suggestions using Deep Reinforcement Learning
Praveen Kumar Bodigutla
AI4TS
56
2
0
10 Aug 2021
A Survey of Human-in-the-loop for Machine Learning
Xingjiao Wu
Luwei Xiao
Yixuan Sun
Junhang Zhang
Tianlong Ma
Liangbo He
SyDa
137
533
0
02 Aug 2021
Pragmatic Image Compression for Human-in-the-Loop Decision-Making
S. Reddy
Anca Dragan
Sergey Levine
OffRL
86
13
0
07 Jul 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
302
5,702
0
07 Jul 2021
Is Automated Topic Model Evaluation Broken?: The Incoherence of Coherence
Alexander Miserlis Hoyle
Pranav Goel
Denis Peskov
Andrew Hian-Cheong
Jordan L. Boyd-Graber
Philip Resnik
166
132
0
05 Jul 2021
The MineRL BASALT Competition on Learning from Human Feedback
Rohin Shah
Cody Wild
Steven H. Wang
Neel Alex
Brandon Houghton
...
Stephanie Milani
Nicholay Topin
Pieter Abbeel
Stuart J. Russell
Anca Dragan
93
32
0
05 Jul 2021
Cogment: Open Source Framework For Distributed Multi-actor Training, Deployment & Operations
AI Redefined
S. Gottipati
Sagar Kurandwad
Clodéric Mars
Gregory Szriftgiser
François Chabot
67
8
0
21 Jun 2021
Diversity driven Query Rewriting in Search Advertising
Akash Kumar Mohankumar
Nikit Begwani
Amit Singh
54
26
0
07 Jun 2021
Grounding 'Grounding' in NLP
Khyathi Chandu
Yonatan Bisk
A. Black
101
54
0
04 Jun 2021
Dissecting Generation Modes for Abstractive Summarization Models via Ablation and Attribution
Jiacheng Xu
Greg Durrett
99
16
0
03 Jun 2021
Uni-Encoder: A Fast and Accurate Response Selection Paradigm for Generation-Based Dialogue Systems
Chiyu Song
Hongliang He
Haofei Yu
Pengfei Fang
Leyang Cui
Zhenzhong Lan
79
6
0
02 Jun 2021
Hone as You Read: A Practical Type of Interactive Summarization
Tanner A. Bohn
Charles X. Ling
67
9
0
06 May 2021
Reliability Testing for Natural Language Processing Systems
Samson Tan
Shafiq Joty
K. Baxter
Araz Taeihagh
G. Bennett
Min-Yen Kan
103
41
0
06 May 2021
Multitasking Inhibits Semantic Drift
Athul Paul Jacob
M. Lewis
Jacob Andreas
88
13
0
15 Apr 2021
Learning What To Do by Simulating the Past
David Lindner
Rohin Shah
Pieter Abbeel
Anca Dragan
40
4
0
08 Apr 2021
Dynabench: Rethinking Benchmarking in NLP
Douwe Kiela
Max Bartolo
Yixin Nie
Divyansh Kaushik
Atticus Geiger
...
Pontus Stenetorp
Robin Jia
Joey Tianyi Zhou
Christopher Potts
Adina Williams
218
411
0
07 Apr 2021
Creativity and Machine Learning: A Survey
Giorgio Franceschelli
Mirco Musolesi
VLM
AI4CE
129
43
0
06 Apr 2021
Alignment of Language Agents
Zachary Kenton
Tom Everitt
Laura Weidinger
Iason Gabriel
Vladimir Mikulik
G. Irving
93
166
0
26 Mar 2021
Constrained Text Generation with Global Guidance -- Case Study on CommonGen
Yixian Liu
Liwen Zhang
Wenjuan Han
Yue Zhang
Kewei Tu
87
10
0
12 Mar 2021
Putting Humans in the Natural Language Processing Loop: A Survey
Zijie J. Wang
Dongjin Choi
Shenyu Xu
Diyi Yang
LM&MA
105
74
0
06 Mar 2021
Symbolic Behaviour in Artificial Intelligence
Adam Santoro
Andrew Kyle Lampinen
Kory W. Mathewson
Timothy Lillicrap
David Raposo
83
34
0
05 Feb 2021
Scaling Laws for Transfer
Danny Hernandez
Jared Kaplan
T. Henighan
Sam McCandlish
100
251
0
02 Feb 2021
Evaluating the Robustness of Collaborative Agents
P. Knott
Micah Carroll
Sam Devlin
K. Ciosek
Katja Hofmann
Anca Dragan
Rohin Shah
67
36
0
14 Jan 2021
Exploring Fluent Query Reformulations with Text-to-Text Transformers and Reinforcement Learning
Jerry Zikun Chen
S. Yu
Haoran Wang
444
5
0
18 Dec 2020
Open Problems in Cooperative AI
Allan Dafoe
Edward Hughes
Yoram Bachrach
Tantum Collins
Kevin R. McKee
Joel Z Leibo
Kate Larson
T. Graepel
128
203
0
15 Dec 2020
Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks
Julia Kreutzer
Stefan Riezler
Carolin (Haas) Lawrence
RALM
OffRL
74
15
0
04 Nov 2020
Constrained Abstractive Summarization: Preserving Factual Consistency with Constrained Generation
Yuning Mao
Xiang Ren
Heng Ji
Jiawei Han
HILM
196
39
0
24 Oct 2020
What Have We Achieved on Text Summarization?
Dandan Huang
Leyang Cui
Sen Yang
Guangsheng Bao
Kun Wang
Jun Xie
Yue Zhang
126
109
0
09 Oct 2020
Current Limitations of Language Models: What You Need is Retrieval
Aran Komatsuzaki
LRM
39
3
0
15 Sep 2020
A Survey of Evaluation Metrics Used for NLG Systems
Ananya B. Sai
Akash Kumar Mohankumar
Mitesh M. Khapra
ELM
99
237
0
27 Aug 2020
SummEval: Re-evaluating Summarization Evaluation
Alexander R. Fabbri
Wojciech Kryściński
Bryan McCann
Caiming Xiong
R. Socher
Dragomir R. Radev
HILM
161
725
0
24 Jul 2020