ResearchTrend.AI
Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples

13 April 2020
Maximilian Mozes
Pontus Stenetorp
Bennett Kleinberg
Lewis D. Griffin
    AAML

Papers citing "Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples"

46 / 46 papers shown
Exploring Gradient-Guided Masked Language Model to Detect Textual Adversarial Attacks
Xiaomei Zhang
Zhaoxi Zhang
Yanjun Zhang
Xufei Zheng
L. Zhang
Shengshan Hu
Shirui Pan
AAML
58
0
0
08 Apr 2025
One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models
Hao Fang
Jiawei Kong
Wenbo Yu
Bin Chen
Jiawei Li
Hao Wu
Ke Xu
Ke Xu
AAML, VLM
131
13
0
08 Jun 2024
Reversible Jump Attack to Textual Classifiers with Modification Reduction
Mingze Ni
Zhensu Sun
Wei Liu
AAML
56
0
0
21 Mar 2024
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions
Yuan Zhang
Xiao Wang
Zhiheng Xi
Han Xia
Tao Gui
Qi Zhang
Xuanjing Huang
88
4
0
26 Feb 2024
Fast Adversarial Training against Textual Adversarial Attacks
Yichen Yang
Xin Liu
Kun He
AAML
47
4
0
23 Jan 2024
ROIC-DM: Robust Text Inference and Classification via Diffusion Model
Shilong Yuan
Wei Yuan
Hongzhi Yin
Tieke He
DiffM
96
3
0
07 Jan 2024
Improving the Robustness of Transformer-based Large Language Models with Dynamic Attention
Lujia Shen
Yuwen Pu
Shouling Ji
Changjiang Li
Xuhong Zhang
Chunpeng Ge
Ting Wang
AAML
69
6
0
29 Nov 2023
Toward Stronger Textual Attack Detectors
Pierre Colombo
Marine Picot
Nathan Noiry
Guillaume Staerman
Pablo Piantanida
561
5
0
21 Oct 2023
The Trickle-down Impact of Reward (In-)consistency on RLHF
Lingfeng Shen
Sihao Chen
Linfeng Song
Lifeng Jin
Baolin Peng
Haitao Mi
Daniel Khashabi
Dong Yu
93
23
0
28 Sep 2023
What Learned Representations and Influence Functions Can Tell Us About Adversarial Examples
Shakila Mahjabin Tonni
Mark Dras
TDI, AAML, GAN
60
0
0
19 Sep 2023
Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities
Maximilian Mozes
Xuanli He
Bennett Kleinberg
Lewis D. Griffin
87
87
0
24 Aug 2023
Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks
Xinyu Zhang
Hanbin Hong
Yuan Hong
Peng Huang
Binghui Wang
Zhongjie Ba
Kui Ren
SILM
129
25
0
31 Jul 2023
Interpretability and Transparency-Driven Detection and Transformation of Textual Adversarial Examples (IT-DT)
Bushra Sabir
Muhammad Ali Babar
Sharif Abuadbba
SILM
74
10
0
03 Jul 2023
On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection
Songyang Gao
Shihan Dou
Qi Zhang
Xuanjing Huang
Jin Ma
Yingchun Shan
AAML
55
3
0
27 Jun 2023
VoteTRANS: Detecting Adversarial Text without Training by Voting on Hard Labels of Transformations
Hoang-Quoc Nguyen-Son
Seira Hidano
Kazuhide Fukushima
S. Kiyomoto
Isao Echizen
57
0
0
02 Jun 2023
From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework
Yangyi Chen
Hongcheng Gao
Ganqu Cui
Lifan Yuan
Dehan Kong
...
Longtao Huang
H. Xue
Zhiyuan Liu
Maosong Sun
Heng Ji
AAML, ELM
101
6
0
29 May 2023
The Best Defense is Attack: Repairing Semantics in Textual Adversarial Examples
Heng Yang
Ke Li
AAML
114
3
0
06 May 2023
Masked Language Model Based Textual Adversarial Example Detection
Xiaomei Zhang
Zhaoxi Zhang
Qi Zhong
Xufei Zheng
Yanjun Zhang
Shengshan Hu
L. Zhang
AAML
101
2
0
18 Apr 2023
TextDefense: Adversarial Text Detection based on Word Importance Entropy
Lujia Shen
Xuhong Zhang
S. Ji
Yuwen Pu
Chunpeng Ge
Xing Yang
Yanghe Feng
AAML
59
8
0
12 Feb 2023
Less is More: Understanding Word-level Textual Adversarial Attack via n-gram Frequency Descend
Ning Lu
Shengcai Liu
Zhirui Zhang
Qi Wang
Haifeng Liu
Jiaheng Zhang
AAML
152
8
0
06 Feb 2023
TextShield: Beyond Successfully Detecting Adversarial Sentences in Text Classification
Lingfeng Shen
Ze Zhang
Haiyun Jiang
Ying-Cong Chen
AAML
113
5
0
03 Feb 2023
Disentangled Text Representation Learning with Information-Theoretic Perspective for Adversarial Robustness
Jiahao Zhao
Wenji Mao
DRL, OOD
61
3
0
26 Oct 2022
ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation
Fan Yin
Yao Li
Cho-Jui Hsieh
Kai-Wei Chang
AAML
93
4
0
22 Oct 2022
TCAB: A Large-Scale Text Classification Attack Benchmark
Kalyani Asthana
Zhouhang Xie
Wencong You
Adam Noack
Jonathan Brophy
Sameer Singh
Daniel Lowd
119
3
0
21 Oct 2022
Identifying Human Strategies for Generating Word-Level Adversarial Examples
Maximilian Mozes
Bennett Kleinberg
Lewis D. Griffin
AAML
116
2
0
20 Oct 2022
Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP
Yangyi Chen
Hongcheng Gao
Ganqu Cui
Fanchao Qi
Longtao Huang
Zhiyuan Liu
Maosong Sun
SILM
62
56
0
19 Oct 2022
Textwash -- automated open-source text anonymisation
Bennett Kleinberg
Toby P Davies
Maximilian Mozes
64
13
0
27 Aug 2022
Rethinking Textual Adversarial Defense for Pre-trained Language Models
Jiayi Wang
Rongzhou Bao
Zhuosheng Zhang
Hai Zhao
AAML, SILM
56
11
0
21 Jul 2022
Towards Explainability in NLP: Analyzing and Calculating Word Saliency through Word Properties
Jialiang Dong
Zhitao Guan
Longfei Wu
Zijian Zhang
Xiaojiang Du
XAI, AAML, FAtt, MILM
88
2
0
17 Jul 2022
Detecting Textual Adversarial Examples Based on Distributional Characteristics of Data Representations
Na Liu
Mark Dras
Wei Emma Zhang
AAML
44
6
0
29 Apr 2022
Residue-Based Natural Language Adversarial Attack Detection
Vyas Raina
Mark Gales
AAML
72
12
0
17 Apr 2022
"That Is a Suspicious Reaction!": Interpreting Logits Variation to Detect NLP Adversarial Attacks
Edoardo Mosca
Shreyash Agarwal
Javier Rando
Georg Groh
AAML
95
31
0
10 Apr 2022
Adversarial Training for Improving Model Robustness? Look at Both Prediction and Interpretation
Hanjie Chen
Yangfeng Ji
OOD, AAML, VLM
101
21
0
23 Mar 2022
Input-specific Attention Subnetworks for Adversarial Detection
Emil Biju
Anirudh Sriram
Pratyush Kumar
Mitesh M Khapra
AAML
40
5
0
23 Mar 2022
A Prompting-based Approach for Adversarial Example Generation and Robustness Enhancement
Yuting Yang
Pei Huang
Juan Cao
Jintao Li
Yun Lin
Jin Song Dong
Feifei Ma
Jian Zhang
AAML, SILM
96
13
0
21 Mar 2022
Detection of Word Adversarial Examples in Text Classification: Benchmark and Baseline via Robust Density Estimation
Kiyoon Yoo
Jangho Kim
Jiho Jang
Nojun Kwak
225
41
0
03 Mar 2022
Identifying Adversarial Attacks on Text Classifiers
Zhouhang Xie
Jonathan Brophy
Adam Noack
Wencong You
Kalyani Asthana
Carter Perkins
Sabrina Reis
Sameer Singh
Daniel Lowd
AAML
84
10
0
21 Jan 2022
Detecting Textual Adversarial Examples through Randomized Substitution and Vote
Xiaosen Wang
Yifeng Xiong
Kun He
AAML
59
11
0
13 Sep 2021
TREATED: Towards Universal Defense against Textual Adversarial Attacks
Bin Zhu
Zhaoquan Gu
Le Wang
Zhihong Tian
AAML
45
8
0
13 Sep 2021
Contrasting Human- and Machine-Generated Word-Level Adversarial Examples for Text Classification
Maximilian Mozes
Max Bartolo
Pontus Stenetorp
Bennett Kleinberg
Lewis D. Griffin
DeLMO, AAML, SILM
47
7
0
09 Sep 2021
Defending Pre-trained Language Models from Adversarial Word Substitutions Without Performance Sacrifice
Rongzhou Bao
Jiayi Wang
Hai Zhao
AAML
56
43
0
30 May 2021
Adversarial Examples Detection with Bayesian Neural Network
Yao Li
Tongyi Tang
Cho-Jui Hsieh
T. C. Lee
GAN, AAML
58
3
0
18 May 2021
Achieving Model Robustness through Discrete Adversarial Training
Maor Ivgi
Jonathan Berant
AAML
71
28
0
11 Apr 2021
Enhancing Pre-trained Language Model with Lexical Simplification
Rongzhou Bao
Jiayi Wang
Zhuosheng Zhang
Hai Zhao
39
2
0
30 Dec 2020
SHIELD: Defending Textual Neural Networks against Multiple Black-Box Adversarial Attacks with Stochastic Multi-Expert Patcher
Thai Le
Noseong Park
Dongwon Lee
AAML
46
21
0
17 Nov 2020
Manipulating emotions for ground truth emotion analysis
Bennett Kleinberg
20
2
0
16 Jun 2020