ResearchTrend.AI
Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks

25 October 2023
Aradhana Sinha
Ananth Balashankar
Ahmad Beirami
Thi Avrahami
Jilin Chen
Alex Beutel
    AAML

Papers citing "Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks"

22 / 22 papers shown
Certified Robustness Against Natural Language Attacks by Causal Intervention
Haiteng Zhao
Chang Ma
Xinshuai Dong
Anh Tuan Luu
Zhi-Hong Deng
Hanwang Zhang
AAML
61
35
0
24 May 2022
Phrase-level Textual Adversarial Attack with Label Preservation
Yibin Lei
Yu Cao
Dianqi Li
Dinesh Manocha
Meng Fang
Mykola Pechenizkiy
AAML
71
24
0
22 May 2022
Robust Conversational Agents against Imperceptible Toxicity Triggers
Ninareh Mehrabi
Ahmad Beirami
Fred Morstatter
Aram Galstyan
AAML
46
32
0
05 May 2022
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
141
30,021
0
01 Mar 2022
Dynabench: Rethinking Benchmarking in NLP
Douwe Kiela
Max Bartolo
Yixin Nie
Divyansh Kaushik
Atticus Geiger
...
Pontus Stenetorp
Robin Jia
Joey Tianyi Zhou
Christopher Potts
Adina Williams
130
401
0
07 Apr 2021
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models
Tongshuang Wu
Marco Tulio Ribeiro
Jeffrey Heer
Daniel S. Weld
74
246
0
01 Jan 2021
Explaining NLP Models via Minimal Contrastive Editing (MiCE)
Alexis Ross
Ana Marasović
Matthew E. Peters
62
122
0
27 Dec 2020
Neural Deepfake Detection with Factual Structure of Text
Wanjun Zhong
Duyu Tang
Zenan Xu
Ruize Wang
Nan Duan
M. Zhou
Jiahai Wang
Jian Yin
20
62
0
15 Oct 2020
From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks
Steffen Eger
Yannik Benz
AAML
31
45
0
12 Oct 2020
Plug and Play Language Models: A Simple Approach to Controlled Text Generation
Sumanth Dathathri
Andrea Madotto
Janice Lan
Jane Hung
Eric Frank
Piero Molino
J. Yosinski
Rosanne Liu
KELM
98
957
0
04 Dec 2019
Adversarial NLI: A New Benchmark for Natural Language Understanding
Yixin Nie
Adina Williams
Emily Dinan
Joey Tianyi Zhou
Jason Weston
Douwe Kiela
93
991
0
31 Oct 2019
Learning the Difference that Makes a Difference with Counterfactually-Augmented Data
Divyansh Kaushik
Eduard H. Hovy
Zachary Chase Lipton
CML
57
567
0
26 Sep 2019
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
251
440
0
25 Sep 2019
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment
Di Jin
Zhijing Jin
Qiufeng Wang
Peter Szolovits
SILM
AAML
113
1,064
0
27 Jul 2019
Natural Adversarial Examples
Dan Hendrycks
Kevin Zhao
Steven Basart
Jacob Steinhardt
D. Song
OODD
166
1,454
0
16 Jul 2019
PAWS: Paraphrase Adversaries from Word Scrambling
Yuan Zhang
Jason Baldridge
Luheng He
60
537
0
01 Apr 2019
On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models
Paul Michel
Xian Li
Graham Neubig
J. Pino
AAML
50
136
0
15 Mar 2019
Adversarial Risk and the Dangers of Evaluating Against Weak Attacks
J. Uesato
Brendan O'Donoghue
Aaron van den Oord
Pushmeet Kohli
AAML
126
600
0
15 Feb 2018
Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers
Ji Gao
Jack Lanchantin
M. Soffa
Yanjun Qi
AAML
109
716
0
13 Jan 2018
Generating Natural Adversarial Examples
Zhengli Zhao
Dheeru Dua
Sameer Singh
GAN
AAML
138
599
0
31 Oct 2017
Universal adversarial perturbations
Seyed-Mohsen Moosavi-Dezfooli
Alhussein Fawzi
Omar Fawzi
P. Frossard
AAML
110
2,520
0
26 Oct 2016
Intriguing properties of neural networks
Christian Szegedy
Wojciech Zaremba
Ilya Sutskever
Joan Bruna
D. Erhan
Ian Goodfellow
Rob Fergus
AAML
166
14,831
1
21 Dec 2013