Perturbations in the Wild: Leveraging Human-Written Text Perturbations
for Realistic Adversarial Attack and Defense

19 March 2022

Papers citing "Perturbations in the Wild: Leveraging Human-Written Text Perturbations for Realistic Adversarial Attack and Defense"

Title | Authors | Tags | Counts | Date
ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models | Alex Mei, Sharon Levy, William Y. Wang | AAML | 107 9 0 | 14 Oct 2023
MSAC: Multiple Speech Attribute Control Method for Reliable Speech Emotion Recognition | Yu Pan, Yuguang Yang, Yuheng Huang, Jixun Yao, Jingjing Yin, Yanni Hu, Heng Lu, Lei Ma, Jianjun Zhao | | 90 6 0 | 08 Aug 2023
Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks | Hongcheng Gao, Hao Zhang, Yinpeng Dong, Zhijie Deng | AAML | 109 23 0 | 16 Jun 2023
Disinformation 2.0 in the Age of AI: A Cybersecurity Perspective | W. Mazurczyk, Dongwon Lee, Andreas Vlachos | | 50 9 0 | 08 Jun 2023
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations | Lifan Yuan, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Fangyuan Zou, Xingyi Cheng, Heng Ji, Zhiyuan Liu, Maosong Sun | | 144 84 0 | 07 Jun 2023
From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework | Yangyi Chen, Hongcheng Gao, Ganqu Cui, Lifan Yuan, Dehan Kong, ..., Longtao Huang, H. Xue, Zhiyuan Liu, Maosong Sun, Heng Ji | AAML, ELM | 101 6 0 | 29 May 2023
Dynamic Transformers Provide a False Sense of Efficiency | Yiming Chen, Simin Chen, Zexin Li, Wei Yang, Cong Liu, R. Tan, Haizhou Li | AAML | 90 12 0 | 20 May 2023
White-Box Multi-Objective Adversarial Attack on Dialogue Generation | Yufei Li, Zexin Li, Ying Gao, Cong Liu | AAML | 58 12 0 | 05 May 2023
NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models | Yiran Ye, Thai Le, Dongwon Lee | AAML, DeLMO | 67 3 0 | 18 Mar 2023
Learning the Legibility of Visual Text Perturbations | D. Seth, Rickard Stureborg, Danish Pruthi, Bhuwan Dhingra | AAML | 73 7 0 | 09 Mar 2023
TextShield: Beyond Successfully Detecting Adversarial Sentences in Text Classification | Lingfeng Shen, Ze Zhang, Haiyun Jiang, Ying-Cong Chen | AAML | 113 5 0 | 03 Feb 2023
PSSAT: A Perturbed Semantic Structure Awareness Transferring Method for Perturbation-Robust Slot Filling | Guanting Dong, Daichi Guo, Liwen Wang, Xuefeng Li, Zechen Wang, ..., Hao Lei, Xinyue Cui, Yi Huang, Junlan Feng, Weiran Xu | | 70 12 0 | 24 Aug 2022
Token-Modification Adversarial Attacks for Natural Language Processing: A Survey | Tom Roth, Yansong Gao, A. Abuadbba, Surya Nepal, Wei Liu | AAML | 106 12 0 | 01 Mar 2021