Detection of Word Adversarial Examples in Text Classification: Benchmark
and Baseline via Robust Density Estimation

3 March 2022

Nojun Kwak

Papers citing "Detection of Word Adversarial Examples in Text Classification: Benchmark and Baseline via Robust Density Estimation"

Can Your Uncertainty Scores Detect Hallucinated Entity? Min-Hsuan Yeh Max Kamachee Seongheon Park Yixuan Li HILM 119 3 0 17 Feb 2025
Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph Roman Vashurin Ekaterina Fadeeva Artem Vazhentsev Akim Tsvigun Daniil Vasilev ... Timothy Baldwin Maxim Panov Artem Shelmanov HILM 141 28 0 21 Jun 2024
Certifying LLM Safety against Adversarial Prompting Aounon Kumar Chirag Agarwal Suraj Srinivas Aaron Jiaxun Li Soheil Feizi Himabindu Lakkaraju AAML 119 194 0 06 Sep 2023
BERT-Defense: A Probabilistic Model Based on BERT to Combat Cognitively Inspired Orthographic Adversarial Attacks Yannik Keller J. Mackensen Steffen Eger AAML 107 30 0 02 Jun 2021
A Sweet Rabbit Hole by DARCY: Using Honeypots to Detect Universal Trigger's Adversarial Attacks Thai Le Noseong Park Dongwon Lee 158 24 0 20 Nov 2020
Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples Maximilian Mozes Pontus Stenetorp Bennett Kleinberg Lewis D. Griffin AAML 166 103 0 13 Apr 2020
Learning to Discriminate Perturbations for Blocking Adversarial Attacks in Text Classification Yichao Zhou Jyun-Yu Jiang Kai-Wei Chang Wei Wang AAML 63 119 0 06 Sep 2019
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment Di Jin Zhijing Jin Qiufeng Wang Peter Szolovits SILM AAML 199 1,088 0 27 Jul 2019
Combating Adversarial Misspellings with Robust Word Recognition Danish Pruthi Bhuwan Dhingra Zachary Chase Lipton 188 307 0 27 May 2019
Feature Denoising for Improving Adversarial Robustness Cihang Xie Yuxin Wu Laurens van der Maaten Alan Yuille Kaiming He 128 912 0 09 Dec 2018
A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks Kimin Lee Kibok Lee Honglak Lee Jinwoo Shin OODD 199 2,063 0 10 Jul 2018
Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods Nicholas Carlini D. Wagner AAML 131 1,867 0 20 May 2017
Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks Weilin Xu David Evans Yanjun Qi AAML 97 1,273 0 04 Apr 2017
On the (Statistical) Detection of Adversarial Examples Kathrin Grosse Praveen Manoharan Nicolas Papernot Michael Backes Patrick McDaniel AAML 86 714 0 21 Feb 2017
On Detecting Adversarial Perturbations J. H. Metzen Tim Genewein Volker Fischer Bastian Bischoff AAML 80 950 0 14 Feb 2017
Convolutional Neural Networks for Sentence Classification Yoon Kim AILaw VLM 644 13,432 0 25 Aug 2014