2103.00453
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
28 February 2021
Timo Schick
Sahana Udupa
Hinrich Schütze
Papers citing "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP"
50 / 256 papers shown
An Empirical Comparison of LM-based Question and Answer Generation Methods
Asahi Ushio
Fernando Alva-Manchego
Jose Camacho-Collados
78
21
0
26 May 2023
An Efficient Multilingual Language Model Compression through Vocabulary Trimming
Asahi Ushio
Yi Zhou
Jose Camacho-Collados
135
8
0
24 May 2023
Trade-Offs Between Fairness and Privacy in Language Modeling
Cleo Matzken
Steffen Eger
Ivan Habernal
SILM
120
6
0
24 May 2023
Debiasing should be Good and Bad: Measuring the Consistency of Debiasing Techniques in Language Models
Robert D Morabito
Jad Kabbara
Ali Emami
52
7
0
23 May 2023
On Bias and Fairness in NLP: Investigating the Impact of Bias and Debiasing in Language Models on the Fairness of Toxicity Detection
Fatma Elsafoury
Stamos Katsigiannis
77
1
0
22 May 2023
Word Embeddings Are Steers for Language Models
Chi Han
Jialiang Xu
Manling Li
Yi R. Fung
Chenkai Sun
Nan Jiang
Tarek Abdelzaher
Heng Ji
LLMSV
111
43
0
22 May 2023
ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval
Yue Yu
Yuchen Zhuang
Rongzhi Zhang
Yu Meng
Jiaming Shen
Chao Zhang
VLM
89
37
0
18 May 2023
PaLM 2 Technical Report
Rohan Anil
Andrew M. Dai
Orhan Firat
Melvin Johnson
Dmitry Lepikhin
...
Ce Zheng
Wei Zhou
Denny Zhou
Slav Petrov
Yonghui Wu
ReLM
LRM
269
1,214
0
17 May 2023
On the Origins of Bias in NLP through the Lens of the Jim Code
Fatma Elsafoury
Gavin Abercrombie
98
4
0
16 May 2023
RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs
Afra Feyza Akyürek
Ekin Akyürek
Aman Madaan
Ashwin Kalyan
Peter Clark
Derry Wijaya
Niket Tandon
ALM
KELM
114
102
0
15 May 2023
Data Bias Management
Gianluca Demartini
Kevin Roitero
Stefano Mizzaro
139
7
0
15 May 2023
Beyond the Safeguards: Exploring the Security Risks of ChatGPT
Erik Derner
Kristina Batistic
SILM
89
68
0
13 May 2023
Surfacing Biases in Large Language Models using Contrastive Input Decoding
G. Yona
Or Honovich
Itay Laish
Roee Aharoni
65
12
0
12 May 2023
PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives
Silin Gao
Beatriz Borges
B. Su
Stan N. Finkelstein
Saya Kanno
Hiromi Wakaki
Yuki Mitsufuji
Antoine Bosselut
98
20
0
03 May 2023
Generation-driven Contrastive Self-training for Zero-shot Text Classification with Instruction-following LLM
Ruohong Zhang
Yau-Shian Wang
Yiming Yang
SyDa
63
10
0
24 Apr 2023
Effectiveness of Debiasing Techniques: An Indigenous Qualitative Analysis
Vithya Yogarajan
Gillian Dobbie
Henry Gouk
77
3
0
17 Apr 2023
Evaluation of Social Biases in Recent Large Pre-Trained Models
Swapnil Sharma
Nikita Anand
V. KranthiKiranG.
Alind Jain
52
0
0
13 Apr 2023
Toxicity in ChatGPT: Analyzing Persona-assigned Language Models
Ameet Deshpande
Vishvak Murahari
Tanmay Rajpurohit
Ashwin Kalyan
Karthik Narasimhan
LM&MA
LLMAG
106
374
0
11 Apr 2023
ImageCaptioner$^2$: Image Captioner for Image Captioning Bias Amplification Assessment
Eslam Mohamed Bakr
Pengzhan Sun
Erran L. Li
Mohamed Elhoseiny
58
6
0
10 Apr 2023
Socio-economic landscape of digital transformation & public NLP systems: A critical review
Satyam Mohla
Anupam Guha
86
1
0
04 Apr 2023
Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense
Andrei Kucharavy
Z. Schillaci
Loic Maréchal
Maxime Wursch
Ljiljana Dolamic
Remi Sabonnadiere
Dimitri Percia David
Alain Mermoud
Vincent Lenders
ELM
AI4CE
83
33
0
21 Mar 2023
The Life Cycle of Knowledge in Big Language Models: A Survey
Boxi Cao
Hongyu Lin
Xianpei Han
Le Sun
KELM
95
29
0
14 Mar 2023
Erasing Concepts from Diffusion Models
Rohit Gandikota
Joanna Materzyńska
Jaden Fiotto-Kaufman
David Bau
DiffM
138
313
0
13 Mar 2023
Logic Against Bias: Textual Entailment Mitigates Stereotypical Sentence Reasoning
Hongyin Luo
James R. Glass
NAI
59
7
0
10 Mar 2023
Systematic Rectification of Language Models via Dead-end Analysis
Mengyao Cao
Mehdi Fatemi
Jackie C.K. Cheung
Samira Shabanian
KELM
75
16
0
27 Feb 2023
Toward Fairness in Text Generation via Mutual Information Minimization based on Importance Sampling
Rui Wang
Pengyu Cheng
Ricardo Henao
60
12
0
25 Feb 2023
Modular Deep Learning
Jonas Pfeiffer
Sebastian Ruder
Ivan Vulić
Edoardo Ponti
MoMe
OOD
161
80
0
22 Feb 2023
Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements
Jiawen Deng
Jiale Cheng
Hao Sun
Zhexin Zhang
Minlie Huang
LM&MA
ELM
95
17
0
18 Feb 2023
The Capacity for Moral Self-Correction in Large Language Models
Deep Ganguli
Amanda Askell
Nicholas Schiefer
Thomas I. Liao
Kamilė Lukošiūtė
...
Tom B. Brown
C. Olah
Jack Clark
Sam Bowman
Jared Kaplan
LRM
ReLM
92
171
0
15 Feb 2023
Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models
Shrimai Prabhumoye
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
LM&MA
59
21
0
14 Feb 2023
BiasTestGPT: Using ChatGPT for Social Bias Testing of Language Models
Rafal Kocielnik
Shrimai Prabhumoye
Vivian Zhang
Roy Jiang
R. Alvarez
Anima Anandkumar
88
8
0
14 Feb 2023
Towards Agile Text Classifiers for Everyone
Maximilian Mozes
Jessica Hoffmann
Katrin Tomanek
Muhamed Kouate
Nithum Thain
Ann Yuan
Tolga Bolukbasi
Lucas Dixon
103
13
0
13 Feb 2023
Parameter-efficient Modularised Bias Mitigation via AdapterFusion
Deepak Kumar
Oleg Lesota
George Zerveas
Daniel Cohen
Carsten Eickhoff
Markus Schedl
Navid Rekabsaz
MoMe
KELM
91
28
0
13 Feb 2023
Using In-Context Learning to Improve Dialogue Safety
Nicholas Meade
Spandana Gella
Devamanyu Hazarika
Prakhar Gupta
Di Jin
Siva Reddy
Yang Liu
Dilek Z. Hakkani-Tür
127
40
0
02 Feb 2023
Comparing Intrinsic Gender Bias Evaluation Measures without using Human Annotated Examples
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
60
10
0
28 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
198
36
0
01 Jan 2023
Controllable Text Generation with Language Constraints
Howard Chen
Huihan Li
Danqi Chen
Karthik Narasimhan
67
16
0
20 Dec 2022
Foveate, Attribute, and Rationalize: Towards Physically Safe and Trustworthy AI
Alex Mei
Sharon Levy
William Yang Wang
83
7
0
19 Dec 2022
Constructing Highly Inductive Contexts for Dialogue Safety through Controllable Reverse Generation
Zhexin Zhang
Jiale Cheng
Hao Sun
Jiawen Deng
Fei Mi
Yasheng Wang
Lifeng Shang
Minlie Huang
SILM
159
9
0
04 Dec 2022
Undesirable Biases in NLP: Addressing Challenges of Measurement
Oskar van der Wal
Dominik Bachmann
Alina Leidinger
L. Maanen
Willem H. Zuidema
K. Schulz
94
7
0
24 Nov 2022
AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies
Weiyan Shi
Emily Dinan
Adithya Renduchintala
Daniel Fried
Athul Paul Jacob
Zhou Yu
M. Lewis
AAML
118
2
0
22 Nov 2022
Can You Label Less by Using Out-of-Domain Data? Active & Transfer Learning with Few-shot Instructions
Rafal Kocielnik
Sara Kangaslahti
Shrimai Prabhumoye
M. Hari
R. Alvarez
Anima Anandkumar
61
8
0
21 Nov 2022
Conceptor-Aided Debiasing of Large Language Models
Yifei Li
Lyle Ungar
João Sedoc
76
5
0
20 Nov 2022
ADEPT: A DEbiasing PrompT Framework
Ke Yang
Charles Yu
Yi R. Fung
Manling Li
Heng Ji
130
27
0
10 Nov 2022
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
P. Schramowski
Manuel Brack
Bjorn Deiseroth
Kristian Kersting
163
312
0
09 Nov 2022
LMentry: A Language Model Benchmark of Elementary Language Tasks
Avia Efrat
Or Honovich
Omer Levy
109
20
0
03 Nov 2022
A Robust Bias Mitigation Procedure Based on the Stereotype Content Model
Eddie L. Ungless
Amy Rafferty
Hrichika Nag
Bjorn Ross
72
30
0
26 Oct 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
110
35
0
25 Oct 2022
Differentially Private Language Models for Secure Data Sharing
Justus Mattern
Zhijing Jin
Benjamin Weggenmann
Bernhard Schoelkopf
Mrinmaya Sachan
SyDa
121
52
0
25 Oct 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
338
3,179
0
20 Oct 2022