2009.11462
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
24 September 2020
Samuel Gehman
Suchin Gururangan
Maarten Sap
Yejin Choi
Noah A. Smith
Papers citing
"RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models"
50 / 772 papers shown
Text Generation with Text-Editing Models
Eric Malmi
Yue Dong
Jonathan Mallinson
A. Chuklin
Jakub Adamek
Daniil Mirylenka
Felix Stahlberg
Sebastian Krause
Shankar Kumar
Aliaksei Severyn
KELM
41
25
0
14 Jun 2022
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu
Sean Welleck
Jack Hessel
Liwei Jiang
Lianhui Qin
Peter West
Prithviraj Ammanabrolu
Yejin Choi
MU
68
206
0
26 May 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents
Hyunwoo J. Kim
Youngjae Yu
Liwei Jiang
Ximing Lu
Daniel Khashabi
Gunhee Kim
Yejin Choi
Maarten Sap
25
119
0
25 May 2022
Gradient-Based Constrained Sampling from Language Models
Sachin Kumar
Biswajit Paria
Yulia Tsvetkov
BDL
39
53
0
25 May 2022
Challenges in Measuring Bias via Open-Ended Language Generation
Afra Feyza Akyürek
Muhammed Yusuf Kocyigit
Sejin Paik
Derry Wijaya
43
22
0
23 May 2022
RL with KL penalties is better viewed as Bayesian inference
Tomasz Korbak
Ethan Perez
Christopher L. Buckley
OffRL
38
73
0
23 May 2022
TempLM: Distilling Language Models into Template-Based Generators
Tianyi Zhang
Mina Lee
Lisa Li
Ende Shen
Tatsunori B. Hashimoto
VLM
45
5
0
23 May 2022
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Kushal Tirumala
Aram H. Markosyan
Luke Zettlemoyer
Armen Aghajanyan
TDI
31
187
0
22 May 2022
"I'm sorry to hear that": Finding New Biases in Language Models with a Holistic Descriptor Dataset
Eric Michael Smith
Melissa Hall
Melanie Kambadur
Eleonora Presani
Adina Williams
83
130
0
18 May 2022
Classifiers are Better Experts for Controllable Text Generation
Askhat Sitdikov
Nikita Balagansky
Daniil Gavrilov
Alexander Markov
38
7
0
15 May 2022
Mitigating Toxic Degeneration with Empathetic Data: Exploring the Relationship Between Toxicity and Empathy
Allison Lahnala
Charles F Welch
Béla Neuendorf
Lucie Flek
65
13
0
15 May 2022
Efficient and Training-Free Control of Language Generation
Shangda Wu
Maosong Sun
24
1
0
12 May 2022
Richer Countries and Richer Representations
Kaitlyn Zhou
Kawin Ethayarajh
Dan Jurafsky
46
9
0
10 May 2022
Robust Conversational Agents against Imperceptible Toxicity Triggers
Ninareh Mehrabi
Ahmad Beirami
Fred Morstatter
Aram Galstyan
AAML
26
32
0
05 May 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
97
3,522
0
02 May 2022
Detoxifying Language Models with a Toxic Corpus
Yoon A Park
Frank Rudzicz
27
6
0
30 Apr 2022
Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models
Sanghwan Bae
Donghyun Kwak
Sungdong Kim
Dong-hyun Ham
Soyoung Kang
Sang-Woo Lee
W. Park
ALM
30
37
0
30 Apr 2022
Handling and Presenting Harmful Text in NLP Research
Hannah Rose Kirk
Abeba Birhane
Bertie Vidgen
Leon Derczynski
26
47
0
29 Apr 2022
Training Language Models with Language Feedback
Jérémy Scheurer
Jon Ander Campos
Jun Shern Chan
Angelica Chen
Kyunghyun Cho
Ethan Perez
ALM
48
48
0
29 Apr 2022
Instilling Type Knowledge in Language Models via Multi-Task QA
Shuyang Li
Mukund Sridhar
Chandan Prakash
Jin Cao
Wael Hamza
Julian McAuley
KELM
33
6
0
28 Apr 2022
On the Limitations of Dataset Balancing: The Lost Battle Against Spurious Correlations
Roy Schwartz
Gabriel Stanovsky
37
26
0
27 Apr 2022
LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models
Mor Geva
Avi Caciularu
Guy Dar
Paul Roit
Shoval Sadde
Micah Shlain
Bar Tamir
Yoav Goldberg
KELM
35
27
0
26 Apr 2022
Which Discriminator for Cooperative Text Generation?
Antoine Chaffin
Thomas Scialom
Sylvain Lamprier
Jacopo Staiano
Benjamin Piwowarski
Ewa Kijak
Vincent Claveau
22
4
0
25 Apr 2022
A Review on Language Models as Knowledge Bases
Badr AlKhamissi
Millicent Li
Asli Celikyilmaz
Mona T. Diab
Marjan Ghazvininejad
KELM
44
175
0
12 Apr 2022
The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems
Caleb Ziems
Jane A. Yu
Yi-Chia Wang
A. Halevy
Diyi Yang
28
92
0
06 Apr 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
136
6,035
0
05 Apr 2022
Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study
Serra Sinem Tekiroğlu
Helena Bonaldi
Margherita Fanton
Marco Guerini
24
43
0
04 Apr 2022
PanGu-Bot: Efficient Generative Dialogue Pre-training from Pre-trained Language Model
Fei Mi
Yitong Li
Yulong Zeng
Jingyan Zhou
Yasheng Wang
Chuanfei Xu
Lifeng Shang
Xin Jiang
Shiqi Zhao
Qun Liu
ALM
45
18
0
31 Mar 2022
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
69
1,852
0
29 Mar 2022
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Mor Geva
Avi Caciularu
Ke Wang
Yoav Goldberg
KELM
71
338
0
28 Mar 2022
Mix and Match: Learning-free Controllable Text Generation using Energy Language Models
Fatemehsadat Mireshghallah
Kartik Goyal
Taylor Berg-Kirkpatrick
36
78
0
24 Mar 2022
ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection
Thomas Hartvigsen
Saadia Gabriel
Hamid Palangi
Maarten Sap
Dipankar Ray
Ece Kamar
38
353
0
17 Mar 2022
Leashing the Inner Demons: Self-Detoxification for Language Models
Canwen Xu
Zexue He
Zhankui He
Julian McAuley
22
26
0
06 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
381
12,081
0
04 Mar 2022
Controllable Natural Language Generation with Contrastive Prefixes
Jing Qian
Li Dong
Yelong Shen
Furu Wei
Weizhu Chen
10
95
0
27 Feb 2022
AugESC: Dialogue Augmentation with Large Language Models for Emotional Support Conversation
Chujie Zheng
Sahand Sabour
Jiaxin Wen
Zheng Zhang
Minlie Huang
24
57
0
26 Feb 2022
Capturing Failures of Large Language Models via Human Cognitive Biases
Erik Jones
Jacob Steinhardt
33
91
0
24 Feb 2022
Reward Modeling for Mitigating Toxicity in Transformer-based Language Models
Farshid Faal
K. Schmitt
Jia Yuan Yu
13
24
0
19 Feb 2022
'Beach' to 'Bitch': Inadvertent Unsafe Transcription of Kids' Content on YouTube
Krithika Ramesh
Ashiqur R. KhudaBukhsh
Sumeet Kumar
28
4
0
17 Feb 2022
Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks
Jingyan Zhou
Deng Jiawen
Fei Mi
Yitong Li
Yasheng Wang
Minlie Huang
Xin Jiang
Qun Liu
Helen Meng
33
31
0
16 Feb 2022
Impact of Pretraining Term Frequencies on Few-Shot Reasoning
Yasaman Razeghi
Robert L Logan IV
Matt Gardner
Sameer Singh
ReLM
LRM
32
150
0
15 Feb 2022
Can Machines Help Us Answering Question 16 in Datasheets, and In Turn Reflecting on Inappropriate Content?
P. Schramowski
Christopher Tauchmann
Kristian Kersting
FaML
25
87
0
14 Feb 2022
Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations
Yu Meng
Yunyi Zhang
Jiaxin Huang
Yu Zhang
Jiawei Han
56
56
0
09 Feb 2022
Generating Training Data with Language Models: Towards Zero-Shot Language Understanding
Yu Meng
Jiaxin Huang
Yu Zhang
Jiawei Han
SyDa
32
229
0
09 Feb 2022
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
Wei Ping
Ming-Yu Liu
Chaowei Xiao
P. Xu
M. Patwary
M. Shoeybi
Bo-wen Li
Anima Anandkumar
Bryan Catanzaro
31
65
0
08 Feb 2022
Cedille: A large autoregressive French language model
Martin Müller
Florian Laurent
36
19
0
07 Feb 2022
Red Teaming Language Models with Language Models
Ethan Perez
Saffron Huang
Francis Song
Trevor Cai
Roman Ring
John Aslanides
Amelia Glaese
Nat McAleese
G. Irving
AAML
13
611
0
07 Feb 2022
Transformers and the representation of biomedical background knowledge
Oskar Wysocki
Zili Zhou
Paul O'Regan
D. Ferreira
Magdalena Wysocka
Dónal Landers
Idiap Research Institute
MedIm
18
14
0
04 Feb 2022
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Shaden Smith
M. Patwary
Brandon Norick
P. LeGresley
Samyam Rajbhandari
...
M. Shoeybi
Yuxiong He
Michael Houston
Saurabh Tiwary
Bryan Catanzaro
MoE
93
733
0
28 Jan 2022
Twitter-Demographer: A Flow-based Tool to Enrich Twitter Data
Federico Bianchi
Vincenzo Cutrona
Dirk Hovy
22
4
0
26 Jan 2022