Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution
Fanchao Qi, Yuan Yao, Sophia Xu, Zhiyuan Liu, Maosong Sun
11 June 2021 · SILM
ArXiv: 2106.06361 · PDF · HTML
Papers citing "Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution"
50 of 86 papers shown
The Ultimate Cookbook for Invisible Poison: Crafting Subtle Clean-Label Text Backdoors with Style Attributes
Wencong You, Daniel Lowd
24 Apr 2025 · 39 / 0 / 0

NLP Security and Ethics, in the Wild
Heather Lent, Erick Galinkin, Yiyi Chen, Jens Myrup Pedersen, Leon Derczynski, Johannes Bjerva
09 Apr 2025 · SILM · 47 / 0 / 0

Defending Deep Neural Networks against Backdoor Attacks via Module Switching
Weijun Li, Ansh Arora, Xuanli He, Mark Dras, Qiongkai Xu
08 Apr 2025 · AAML, MoMe · 53 / 0 / 0

Cut the Deadwood Out: Post-Training Model Purification with Selective Module Substitution
Yao Tong, Weijun Li, Xuanli He, Haolan Zhan, Qiongkai Xu
31 Dec 2024 · AAML · 48 / 1 / 0

Backdoored Retrievers for Prompt Injection Attacks on Retrieval Augmented Generation of Large Language Models
Cody Clop, Yannick Teglia
18 Oct 2024 · AAML, SILM, RALM · 52 / 3 / 0

AdvBDGen: Adversarially Fortified Prompt-Specific Fuzzy Backdoor Generator Against LLM Alignment
Pankayaraj Pathmanathan, Udari Madhushani Sehwag, Michael-Andrei Panaitescu-Liess, Furong Huang
15 Oct 2024 · SILM, AAML · 43 / 0 / 0

Mind Your Questions! Towards Backdoor Attacks on Text-to-Visualization Models
Shuaimin Li, Yuanfeng Song, Xuanang Chen, Anni Peng, Zhuoyue Wan, Chen Jason Zhang, Raymond Chi-Wing Wong
09 Oct 2024 · SILM · 31 / 0 / 0

BadCM: Invisible Backdoor Attack Against Cross-Modal Learning
Zheng Zhang, Xu Yuan, Lei Zhu, Jingkuan Song, Liqiang Nie
03 Oct 2024 · AAML · 48 / 11 / 0

Data-centric NLP Backdoor Defense from the Lens of Memorization
Zhenting Wang, Zhizhi Wang, Mingyu Jin, Mengnan Du, Juan Zhai, Shiqing Ma
21 Sep 2024 · 35 / 3 / 0

CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models
Rui Zeng, Xi Chen, Yuwen Pu, Xuhong Zhang, Tianyu Du, Shouling Ji
02 Sep 2024 · 43 / 2 / 0

The Dark Side of Human Feedback: Poisoning Large Language Models via User Inputs
Bocheng Chen, Hanqing Guo, Guangjing Wang, Yuanda Wang, Qiben Yan
01 Sep 2024 · AAML · 44 / 4 / 0

Large Language Models are Good Attackers: Efficient and Stealthy Textual Backdoor Attacks
Ziqiang Li, Yueqi Zeng, Pengfei Xia, Lei Liu, Zhangjie Fu, Bin Li
21 Aug 2024 · SILM, AAML · 55 / 2 / 0

Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
Apurv Verma, Satyapriya Krishna, Sebastian Gehrmann, Madhavan Seshadri, Anu Pradhan, Tom Ault, Leslie Barrett, David Rabinowitz, John Doucette, Nhathai Phan
20 Jul 2024 · 59 / 10 / 0

Turning Generative Models Degenerate: The Power of Data Poisoning Attacks
Shuli Jiang, S. Kadhe, Yi Zhou, Farhan Ahmed, Ling Cai, Nathalie Baracaldo
17 Jul 2024 · SILM, AAML · 41 / 4 / 0

Securing Multi-turn Conversational Language Models Against Distributed Backdoor Triggers
Terry Tong, Lyne Tchapmi, Qin Liu, Muhao Chen
04 Jul 2024 · AAML, SILM · 53 / 1 / 0

Stealthy Targeted Backdoor Attacks against Image Captioning
Wenshu Fan, Hongwei Li, Wenbo Jiang, Meng Hao, Shui Yu, Xiao Zhang
09 Jun 2024 · DiffM · 27 / 6 / 0

TrojFM: Resource-efficient Backdoor Attacks against Very Large Foundation Models
Yuzhou Nie, Yanting Wang, Jinyuan Jia, Michael J. De Lucia, Nathaniel D. Bastian, Wenbo Guo, Dawn Song
27 May 2024 · SILM, AAML · 38 / 5 / 0

SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks
Xuanli He, Qiongkai Xu, Jun Wang, Benjamin I. P. Rubinstein, Trevor Cohn
19 May 2024 · AAML · 42 / 4 / 0

Exploring Backdoor Vulnerabilities of Chat Models
Yunzhuo Hao, Wenkai Yang, Yankai Lin
03 Apr 2024 · SILM, KELM · 29 / 9 / 0

Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors
Victoria Graf, Qin Liu, Muhao Chen
02 Apr 2024 · AAML · 40 / 8 / 0

Here's a Free Lunch: Sanitizing Backdoored Models with Model Merge
Ansh Arora, Xuanli He, Maximilian Mozes, Srinibas Swain, Mark Dras, Qiongkai Xu
29 Feb 2024 · SILM, MoMe, AAML · 58 / 13 / 0

Semantic Mirror Jailbreak: Genetic Algorithm Based Jailbreak Prompts Against Open-source LLMs
Xiaoxia Li, Siyuan Liang, Jiyi Zhang, Hansheng Fang, Aishan Liu, Ee-Chien Chang
21 Feb 2024 · 90 / 24 / 0

Backdoor Attack against One-Class Sequential Anomaly Detection Models
He Cheng, Shuhan Yuan
15 Feb 2024 · SILM, AAML · 27 / 1 / 0

Punctuation Matters! Stealthy Backdoor Attack for Language Models
Xuan Sheng, Zhicheng Li, Zhaoyang Han, Xiangmao Chang, Piji Li
26 Dec 2023 · 43 / 3 / 0

Poisoned ChatGPT Finds Work for Idle Hands: Exploring Developers' Coding Practices with Insecure Suggestions from Poisoned AI Models
Sanghak Oh, Kiho Lee, Seonhye Park, Doowon Kim, Hyoungshick Kim
11 Dec 2023 · SILM · 29 / 16 / 0

Forcing Generative Models to Degenerate Ones: The Power of Data Poisoning Attacks
Shuli Jiang, S. Kadhe, Yi Zhou, Ling Cai, Nathalie Baracaldo
07 Dec 2023 · SILM, AAML · 25 / 13 / 0

TARGET: Template-Transferable Backdoor Attack Against Prompt-based NLP Models via GPT4
Zihao Tan, Qingliang Chen, Yongjian Huang, Chen Liang
29 Nov 2023 · SILM, AAML · 42 / 3 / 0

Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift
Shengwei An, Sheng-Yen Chou, Kaiyuan Zhang, Qiuling Xu, Guanhong Tao, ..., Shuyang Cheng, Shiqing Ma, Pin-Yu Chen, Tsung-Yi Ho, Xiangyu Zhang
27 Nov 2023 · DiffM, AAML · 41 / 28 / 0

Efficient Trigger Word Insertion
Yueqi Zeng, Ziqiang Li, Pengfei Xia, Lei Liu, Bin Li
23 Nov 2023 · AAML · 21 / 5 / 0

TextGuard: Provable Defense against Backdoor Attacks on Text Classification
Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo-wen Li, Dawn Song
19 Nov 2023 · SILM · 21 / 9 / 0

Attention-Enhancing Backdoor Attacks Against BERT-based Models
Weimin Lyu, Songzhu Zheng, Lu Pang, Haibin Ling, Chao Chen
23 Oct 2023 · 29 / 35 / 0

Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review
Pengzhou Cheng, Zongru Wu, Wei Du, Haodong Zhao, Wei Lu, Gongshen Liu
12 Sep 2023 · SILM, AAML · 37 / 18 / 0

ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP
Lu Yan, Zhuo Zhang, Guanhong Tao, Kaiyuan Zhang, Xuan Chen, Guangyu Shen, Xiangyu Zhang
04 Aug 2023 · AAML, SILM · 62 / 16 / 0

On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook
Mingyuan Fan, Chengyu Wang, Cen Chen, Yang Liu, Jun Huang
31 Jul 2023 · HILM · 39 / 3 / 0

TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models
Jiaqi Xue, Mengxin Zheng, Ting Hua, Yilin Shen, Ye Liu, Ladislau Bölöni, Qian Lou
12 Jun 2023 · 41 / 31 / 0

NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models
Kai Mei, Zheng Li, Zhenting Wang, Yang Zhang, Shiqing Ma
28 May 2023 · AAML, SILM · 37 / 48 / 0

Backdooring Neural Code Search
Weisong Sun, Yuchen Chen, Guanhong Tao, Chunrong Fang, Xiangyu Zhang, Quanjun Zhang, Bin Luo
27 May 2023 · SILM · 30 / 16 / 0

IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks
Xuanli He, Jun Wang, Benjamin I. P. Rubinstein, Trevor Cohn
25 May 2023 · SILM · 34 / 12 / 0

From Shortcuts to Triggers: Backdoor Defense with Denoised PoE
Qin Liu, Fei Wang, Chaowei Xiao, Muhao Chen
24 May 2023 · AAML · 37 / 22 / 0

Watermarking Text Data on Large Language Models for Dataset Copyright
Yixin Liu, Hongsheng Hu, Xun Chen, Xuyun Zhang, Lichao Sun
22 May 2023 · WaLM · 21 / 22 / 0

Mitigating Backdoor Poisoning Attacks through the Lens of Spurious Correlation
Xuanli He, Qiongkai Xu, Jun Wang, Benjamin I. P. Rubinstein, Trevor Cohn
19 May 2023 · AAML · 37 / 18 / 0

UOR: Universal Backdoor Attacks on Pre-trained Language Models
Wei Du, Peixuan Li, Bo-wen Li, Haodong Zhao, Gongshen Liu
16 May 2023 · AAML · 39 / 7 / 0

Backdoor Learning on Sequence to Sequence Models
Lichang Chen, Minhao Cheng, Heng-Chiao Huang
03 May 2023 · SILM · 54 / 18 / 0

Defending against Insertion-based Textual Backdoor Attacks via Attribution
Jiazhao Li, Zhuofeng Wu, Ming-Yu Liu, Chaowei Xiao, V. Vydiswaran
03 May 2023 · 48 / 23 / 0

Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models
Shuai Zhao, Jinming Wen, Anh Tuan Luu, Jun Zhao, Jie Fu
02 May 2023 · SILM · 62 / 90 / 0

ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger
Jiazhao Li, Yijin Yang, Zhuofeng Wu, V. Vydiswaran, Chaowei Xiao
27 Apr 2023 · SILM · 67 / 42 / 0

Backdoor Attacks with Input-unique Triggers in NLP
Xukun Zhou, Jiwei Li, Tianwei Zhang, Lingjuan Lyu, Muqiao Yang, Jun He
25 Mar 2023 · SILM, AAML · 30 / 9 / 0

Backdoor Learning for NLP: Recent Advances, Challenges, and Future Research Directions
Marwan Omar
14 Feb 2023 · SILM, AAML · 33 / 20 / 0

PECAN: A Deterministic Certified Defense Against Backdoor Attacks
Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni
27 Jan 2023 · AAML · 38 / 4 / 0

Stealthy Backdoor Attack for Code Models
Zhou Yang, Bowen Xu, Jie M. Zhang, Hong Jin Kang, Jieke Shi, Junda He, David Lo
06 Jan 2023 · AAML · 26 / 65 / 0