2310.16955
Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks
25 October 2023
Aradhana Sinha
Ananth Balashankar
Ahmad Beirami
Thi Avrahami
Jilin Chen
Alex Beutel
AAML
Papers citing
"Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks"
22 / 22 papers shown
Title
Certified Robustness Against Natural Language Attacks by Causal Intervention
Haiteng Zhao
Chang Ma
Xinshuai Dong
Anh Tuan Luu
Zhi-Hong Deng
Hanwang Zhang
AAML
61
35
0
24 May 2022
Phrase-level Textual Adversarial Attack with Label Preservation
Yibin Lei
Yu Cao
Dianqi Li
Dinesh Manocha
Meng Fang
Mykola Pechenizkiy
AAML
71
24
0
22 May 2022
Robust Conversational Agents against Imperceptible Toxicity Triggers
Ninareh Mehrabi
Ahmad Beirami
Fred Morstatter
Aram Galstyan
AAML
46
32
0
05 May 2022
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
141
30,021
0
01 Mar 2022
Dynabench: Rethinking Benchmarking in NLP
Douwe Kiela
Max Bartolo
Yixin Nie
Divyansh Kaushik
Atticus Geiger
...
Pontus Stenetorp
Robin Jia
Joey Tianyi Zhou
Christopher Potts
Adina Williams
130
401
0
07 Apr 2021
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models
Tongshuang Wu
Marco Tulio Ribeiro
Jeffrey Heer
Daniel S. Weld
74
246
0
01 Jan 2021
Explaining NLP Models via Minimal Contrastive Editing (MiCE)
Alexis Ross
Ana Marasović
Matthew E. Peters
62
122
0
27 Dec 2020
Neural Deepfake Detection with Factual Structure of Text
Wanjun Zhong
Duyu Tang
Zenan Xu
Ruize Wang
Nan Duan
M. Zhou
Jiahai Wang
Jian Yin
20
62
0
15 Oct 2020
From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks
Steffen Eger
Yannik Benz
AAML
31
45
0
12 Oct 2020
Plug and Play Language Models: A Simple Approach to Controlled Text Generation
Sumanth Dathathri
Andrea Madotto
Janice Lan
Jane Hung
Eric Frank
Piero Molino
J. Yosinski
Rosanne Liu
KELM
98
957
0
04 Dec 2019
Adversarial NLI: A New Benchmark for Natural Language Understanding
Yixin Nie
Adina Williams
Emily Dinan
Joey Tianyi Zhou
Jason Weston
Douwe Kiela
93
991
0
31 Oct 2019
Learning the Difference that Makes a Difference with Counterfactually-Augmented Data
Divyansh Kaushik
Eduard H. Hovy
Zachary Chase Lipton
CML
57
567
0
26 Sep 2019
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
251
440
0
25 Sep 2019
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment
Di Jin
Zhijing Jin
Qiufeng Wang
Peter Szolovits
SILM
AAML
113
1,064
0
27 Jul 2019
Natural Adversarial Examples
Dan Hendrycks
Kevin Zhao
Steven Basart
Jacob Steinhardt
D. Song
OODD
166
1,454
0
16 Jul 2019
PAWS: Paraphrase Adversaries from Word Scrambling
Yuan Zhang
Jason Baldridge
Luheng He
60
537
0
01 Apr 2019
On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models
Paul Michel
Xian Li
Graham Neubig
J. Pino
AAML
50
136
0
15 Mar 2019
Adversarial Risk and the Dangers of Evaluating Against Weak Attacks
J. Uesato
Brendan O'Donoghue
Aaron van den Oord
Pushmeet Kohli
AAML
126
600
0
15 Feb 2018
Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers
Ji Gao
Jack Lanchantin
M. Soffa
Yanjun Qi
AAML
109
716
0
13 Jan 2018
Generating Natural Adversarial Examples
Zhengli Zhao
Dheeru Dua
Sameer Singh
GAN
AAML
138
599
0
31 Oct 2017
Universal adversarial perturbations
Seyed-Mohsen Moosavi-Dezfooli
Alhussein Fawzi
Omar Fawzi
P. Frossard
AAML
110
2,520
0
26 Oct 2016
Intriguing properties of neural networks
Christian Szegedy
Wojciech Zaremba
Ilya Sutskever
Joan Bruna
D. Erhan
Ian Goodfellow
Rob Fergus
AAML
166
14,831
1
21 Dec 2013