RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models

24 September 2020
Samuel Gehman
Suchin Gururangan
Maarten Sap
Yejin Choi
Noah A. Smith

Papers citing "RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models"

50 / 814 papers shown
Galactica: A Large Language Model for Science
Ross Taylor
Marcin Kardas
Guillem Cucurull
Thomas Scialom
Anthony Hartshorn
Elvis Saravia
Andrew Poulton
Viktor Kerkez
Robert Stojnic
ELM, ReLM
131
786
0
16 Nov 2022
kogito: A Commonsense Knowledge Inference Toolkit
Mete Ismayilzada
Antoine Bosselut
71
7
0
15 Nov 2022
The CRINGE Loss: Learning what language not to model
Leonard Adolphs
Tianyu Gao
Jing Xu
Kurt Shuster
Sainbayar Sukhbaatar
Jason Weston
MU
95
37
0
10 Nov 2022
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
P. Schramowski
Manuel Brack
Bjorn Deiseroth
Kristian Kersting
161
312
0
09 Nov 2022
Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning
Yu Meng
Martin Michalski
Jiaxin Huang
Yu Zhang
Tarek Abdelzaher
Jiawei Han
VLM
122
49
0
06 Nov 2022
Generating Sequences by Learning to Self-Correct
Sean Welleck
Ximing Lu
Peter West
Faeze Brahman
T. Shen
Daniel Khashabi
Yejin Choi
LRM
111
238
0
31 Oct 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
167
91
0
31 Oct 2022
Nearest Neighbor Language Models for Stylistic Controllable Generation
Severino Trotta
Lucie Flek
Charles F Welch
89
4
0
27 Oct 2022
SentBS: Sentence-level Beam Search for Controllable Summarization
Chenhui Shen
Liying Cheng
Lidong Bing
Yang You
Luo Si
122
11
0
26 Oct 2022
Piloting Copilot, Codex, and StarCoder2: Hot Temperature, Cold Prompts, or Black Magic?
Jean-Baptiste Döderlein
Nguessan Hermann Kouadio
M. Acher
D. Khelladi
B. Combemale
92
36
0
26 Oct 2022
NeuroCounterfactuals: Beyond Minimal-Edit Counterfactuals for Richer Data Augmentation
Phillip Howard
Gadi Singer
Vasudev Lal
Yejin Choi
Swabha Swayamdipta
CML
118
25
0
22 Oct 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM, LRM
320
3,178
0
20 Oct 2022
Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective
Adaku Uchendu
Thai Le
Dongwon Lee
DeLMO
121
45
0
19 Oct 2022
Language Detoxification with Attribute-Discriminative Latent Space
Jin Myung Kwak
Minseon Kim
Sung Ju Hwang
70
14
0
19 Oct 2022
DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Generation
Hanqing Zhang
Dawei Song
89
38
0
18 Oct 2022
Deep Bidirectional Language-Knowledge Graph Pretraining
Michihiro Yasunaga
Antoine Bosselut
Hongyu Ren
Xikun Zhang
Christopher D. Manning
Percy Liang
J. Leskovec
101
205
0
17 Oct 2022
Prompting GPT-3 To Be Reliable
Chenglei Si
Zhe Gan
Zhengyuan Yang
Shuohang Wang
Jianfeng Wang
Jordan L. Boyd-Graber
Lijuan Wang
KELM, LRM
115
303
0
17 Oct 2022
Keep Me Updated! Memory Management in Long-term Conversations
Sanghwan Bae
Donghyun Kwak
Soyoung Kang
Min Young Lee
Sungdong Kim
Yuin Jeong
Hyeri Kim
Sang-Woo Lee
W. Park
Nako Sung
118
52
0
17 Oct 2022
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey
Sachin Kumar
Vidhisha Balachandran
Lucille Njoo
Antonios Anastasopoulos
Yulia Tsvetkov
ELM
189
91
0
14 Oct 2022
Language Model Decoding as Likelihood-Utility Alignment
Martin Josifoski
Maxime Peyrard
Frano Rajic
Jiheng Wei
Debjit Paul
...
Barun Patra
Vishrav Chaudhary
Emre Kıcıman
Boi Faltings
Robert West
82
5
0
13 Oct 2022
Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization
Zonghan Yang
Xiaoyuan Yi
Peng Li
Yang Liu
Xing Xie
119
34
0
10 Oct 2022
An Analysis of the Effects of Decoding Algorithms on Fairness in Open-Ended Language Generation
Jwala Dhamala
Varun Kumar
Rahul Gupta
Kai-Wei Chang
Aram Galstyan
66
7
0
07 Oct 2022
Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models
David Wingate
Mohammad Shoeybi
Taylor Sorensen
93
78
0
06 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
BDL, LRM
394
1,103
0
05 Oct 2022
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
Zhijing Jin
Sydney Levine
Fernando Gonzalez
Ojasv Kamal
Maarten Sap
Mrinmaya Sachan
Rada Mihalcea
J. Tenenbaum
Bernhard Schölkopf
ELM, LRM
105
103
0
04 Oct 2022
Downstream Datasets Make Surprisingly Good Pretraining Corpora
Kundan Krishna
Saurabh Garg
Jeffrey P. Bigham
Zachary Chase Lipton
108
33
0
28 Sep 2022
Will It Blend? Mixing Training Paradigms & Prompting for Argument Quality Prediction
Michiel van der Meer
Myrthe Reuver
Urja Khurana
Lea Krause
Selene Báez Santamaría
78
14
0
19 Sep 2022
Data Feedback Loops: Model-driven Amplification of Dataset Biases
Rohan Taori
Tatsunori B. Hashimoto
135
48
0
08 Sep 2022
Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots
Waiman Si
Michael Backes
Jeremy Blackburn
Emiliano De Cristofaro
Gianluca Stringhini
Savvas Zannettou
Yang Zhang
98
68
0
07 Sep 2022
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
114
90
0
07 Sep 2022
Elaboration-Generating Commonsense Question Answering at Scale
Wenya Wang
Vivek Srikumar
Hannaneh Hajishirzi
Noah A. Smith
ELM, LRM
76
15
0
02 Sep 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
325
489
0
23 Aug 2022
A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception
Keenan I. Jones
Enes ALTUNCU
V. N. Franqueira
Yi-Chia Wang
Shujun Li
DeLMO
82
3
0
11 Aug 2022
Social Simulacra: Creating Populated Prototypes for Social Computing Systems
J. Park
Lindsay Popowski
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
87
299
0
08 Aug 2022
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
Margaret Li
Suchin Gururangan
Tim Dettmers
M. Lewis
Tim Althoff
Noah A. Smith
Luke Zettlemoyer
MoMe
110
154
0
05 Aug 2022
A Holistic Approach to Undesired Content Detection in the Real World
Todor Markov
Chong Zhang
Sandhini Agarwal
Tyna Eloundou
Teddy Lee
Steven Adler
Angela Jiang
L. Weng
125
237
0
05 Aug 2022
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
Saleh Soltan
Shankar Ananthakrishnan
Jack G. M. FitzGerald
Rahul Gupta
Wael Hamza
...
Mukund Sridhar
Fabian Triefenbach
Apurv Verma
Gokhan Tur
Premkumar Natarajan
135
83
0
02 Aug 2022
ELF22: A Context-based Counter Trolling Dataset to Combat Internet Trolls
Huije Lee
Young Ju Na
Hoyun Song
Jisu Shin
Jong C. Park
38
8
0
30 Jul 2022
Zero-Shot Video Captioning with Evolving Pseudo-Tokens
Yoad Tewel
Yoav Shalev
Roy Nadler
Idan Schwartz
Lior Wolf
70
27
0
22 Jul 2022
Democratizing Ethical Assessment of Natural Language Generation Models
A. Rasekh
Ian W. Eisenberg
ELM
53
1
0
30 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
152
388
0
17 Jun 2022
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models
Maribeth Rauh
John F. J. Mellor
J. Uesato
Po-Sen Huang
Johannes Welbl
...
Amelia Glaese
G. Irving
Iason Gabriel
William S. Isaac
Lisa Anne Hendricks
126
52
0
16 Jun 2022
DIRECTOR: Generator-Classifiers For Supervised Language Modeling
Kushal Arora
Kurt Shuster
Sainbayar Sukhbaatar
Jason Weston
VLM
98
41
0
15 Jun 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM, ReLM, LRM
322
2,526
0
15 Jun 2022
Text Generation with Text-Editing Models
Eric Malmi
Yue Dong
Jonathan Mallinson
A. Chuklin
Jakub Adamek
Daniil Mirylenka
Felix Stahlberg
Sebastian Krause
Shankar Kumar
Aliaksei Severyn
KELM
64
26
0
14 Jun 2022
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu
Sean Welleck
Jack Hessel
Liwei Jiang
Lianhui Qin
Peter West
Prithviraj Ammanabrolu
Yejin Choi
MU
179
220
0
26 May 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents
Hyunwoo J. Kim
Youngjae Yu
Liwei Jiang
Ximing Lu
Daniel Khashabi
Gunhee Kim
Yejin Choi
Maarten Sap
112
128
0
25 May 2022
Gradient-Based Constrained Sampling from Language Models
Sachin Kumar
Biswajit Paria
Yulia Tsvetkov
BDL
99
57
0
25 May 2022
Challenges in Measuring Bias via Open-Ended Language Generation
Afra Feyza Akyürek
Muhammed Yusuf Kocyigit
Sejin Paik
Derry Wijaya
71
26
0
23 May 2022
RL with KL penalties is better viewed as Bayesian inference
Tomasz Korbak
Ethan Perez
Christopher L. Buckley
OffRL
96
77
0
23 May 2022