BadNL: Backdoor Attacks against NLP Models with Semantic-preserving Improvements
arXiv:2006.01043 · 1 June 2020 · SILM
Xiaoyi Chen, A. Salem, Dingfan Chen, Michael Backes, Shiqing Ma, Qingni Shen, Zhonghai Wu, Yang Zhang
Papers citing "BadNL: Backdoor Attacks against NLP Models with Semantic-preserving Improvements" (47 papers)
ChainMarks: Securing DNN Watermark with Cryptographic Chain
Brian Choi, Shu Wang, Isabelle Choi, Kun Sun · 08 May 2025

BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models
Zihan Wang, Hongwei Li, Rui Zhang, Wenbo Jiang, Kangjie Chen, Tianwei Zhang, Qingchuan Zhao, Jiawei Li · AAML · 06 May 2025

A Chaos Driven Metric for Backdoor Attack Detection
Hema Karnam Surendrababu, Nithin Nagaraj · AAML · 06 May 2025

Backdoor Attacks Against Patch-based Mixture of Experts
Cedric Chan, Jona te Lintelo, S. Picek · AAML, MoE · 03 May 2025

The Ultimate Cookbook for Invisible Poison: Crafting Subtle Clean-Label Text Backdoors with Style Attributes
Wencong You, Daniel Lowd · 24 Apr 2025

ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs
Gejian Zhao, Hanzhou Wu, Xinpeng Zhang, Athanasios V. Vasilakos · LRM · 08 Apr 2025

Poisoned Source Code Detection in Code Models
Ehab Ghannoum, Mohammad Ghafari · AAML · 19 Feb 2025

When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations
Huaizhi Ge, Yiming Li, Qifan Wang, Yongfeng Zhang, Ruixiang Tang · AAML, SILM · 19 Nov 2024

Backdooring Vision-Language Models with Out-Of-Distribution Data
Weimin Lyu, Jiachen Yao, Saumya Gupta, Lu Pang, Tao Sun, Lingjie Yi, Lijie Hu, Haibin Ling, Chao Chen · VLM, AAML · 02 Oct 2024

Context is the Key: Backdoor Attacks for In-Context Learning with Vision Transformers
Gorka Abad, S. Picek, Lorenzo Cavallaro, A. Urbieta · SILM · 06 Sep 2024

Defending Code Language Models against Backdoor Attacks with Deceptive Cross-Entropy Loss
Guang Yang, Yu Zhou, Xiang Chen, Xiangyu Zhang, Terry Yue Zhuo, David Lo, Taolue Chen · AAML · 12 Jul 2024

An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection
Shenao Yan, Shen Wang, Yue Duan, Hanbin Hong, Kiho Lee, Doowon Kim, Yuan Hong · AAML, SILM · 10 Jun 2024

Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors
Victoria Graf, Qin Liu, Muhao Chen · AAML · 02 Apr 2024

OrderBkd: Textual backdoor attack through repositioning
Irina Alekseevskaia, Konstantin Arkhipenko · 12 Feb 2024

Comprehensive Assessment of Jailbreak Attacks Against LLMs
Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang · AAML · 08 Feb 2024

Poisoned ChatGPT Finds Work for Idle Hands: Exploring Developers' Coding Practices with Insecure Suggestions from Poisoned AI Models
Sanghak Oh, Kiho Lee, Seonhye Park, Doowon Kim, Hyoungshick Kim · SILM · 11 Dec 2023

Efficient Trigger Word Insertion
Yueqi Zeng, Ziqiang Li, Pengfei Xia, Lei Liu, Bin Li · AAML · 23 Nov 2023
Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations
Wenjie Mo, Lyne Tchapmi, Qin Liu, Jiong Wang, Jun Yan, Chaowei Xiao, Muhao Chen · AAML · 16 Nov 2023
"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
Xinyue Shen
Zhenpeng Chen
Michael Backes
Yun Shen
Yang Zhang
SILM
40
249
0
07 Aug 2023
Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations
T. Krauß
Alexandra Dmitrienko
AAML
27
4
0
06 Jun 2023
NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models
Kai Mei
Zheng Li
Zhenting Wang
Yang Zhang
Shiqing Ma
AAML
SILM
37
48
0
28 May 2023
From Shortcuts to Triggers: Backdoor Defense with Denoised PoE
Qin Liu
Fei Wang
Chaowei Xiao
Muhao Chen
AAML
37
22
0
24 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
45
83
0
19 May 2023
Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning
Shengfang Zhai
Yinpeng Dong
Qingni Shen
Shih-Chieh Pu
Yuejian Fang
Hang Su
32
71
0
07 May 2023
UNICORN: A Unified Backdoor Trigger Inversion Framework
Zhenting Wang
Kai Mei
Juan Zhai
Shiqing Ma
LLMSV
35
44
0
05 Apr 2023
CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning
Hritik Bansal
Nishad Singhi
Yu Yang
Fan Yin
Aditya Grover
Kai-Wei Chang
AAML
34
42
0
06 Mar 2023
Harnessing the Speed and Accuracy of Machine Learning to Advance Cybersecurity
Khatoon Mohammed
AAML
23
10
0
24 Feb 2023
Detecting software vulnerabilities using Language Models
Marwan Omar
32
11
0
23 Feb 2023
Prompt Stealing Attacks Against Text-to-Image Generation Models
Xinyue Shen
Y. Qu
Michael Backes
Yang Zhang
30
32
0
20 Feb 2023
Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective
Baoyuan Wu
Zihao Zhu
Li Liu
Qingshan Liu
Zhaofeng He
Siwei Lyu
AAML
44
21
0
19 Feb 2023
RobustNLP: A Technique to Defend NLP Models Against Backdoor Attacks
Marwan Omar
SILM
AAML
25
0
0
18 Feb 2023
Backdoor Learning for NLP: Recent Advances, Challenges, and Future Research Directions
Marwan Omar
SILM
AAML
33
20
0
14 Feb 2023
SoK: A Systematic Evaluation of Backdoor Trigger Characteristics in Image Classification
Gorka Abad
Jing Xu
Stefanos Koffas
Behrad Tajalli
S. Picek
Mauro Conti
AAML
63
5
0
03 Feb 2023
BDMMT: Backdoor Sample Detection for Language Models through Model Mutation Testing
Jiali Wei
Ming Fan
Wenjing Jiao
Wuxia Jin
Ting Liu
AAML
29
11
0
25 Jan 2023
Backdoor Attacks Against Dataset Distillation
Yugeng Liu
Zheng Li
Michael Backes
Yun Shen
Yang Zhang
DD
42
28
0
03 Jan 2023
Backdoor Attack Detection in Computer Vision by Applying Matrix Factorization on the Weights of Deep Networks
Khondoker Murad Hossain
Tim Oates
AAML
26
4
0
15 Dec 2022
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis
Lukas Struppek
Dominik Hintersdorf
Kristian Kersting
SILM
22
36
0
04 Nov 2022
Generative Poisoning Using Random Discriminators
Dirren van Vlijmen
A. Kolmus
Zhuoran Liu
Zhengyu Zhao
Martha Larson
26
2
0
02 Nov 2022
Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation
Max Glockner
Yufang Hou
Iryna Gurevych
OffRL
38
38
0
25 Oct 2022
Detecting Backdoors in Deep Text Classifiers
Youyan Guo
Jun Wang
Trevor Cohn
SILM
33
1
0
11 Oct 2022
BadRes: Reveal the Backdoors through Residual Connection
Min He
Tianyu Chen
Haoyi Zhou
Shanghang Zhang
Jianxin Li
24
0
0
15 Sep 2022
Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense
Guangyu Shen, Yingqi Liu, Guanhong Tao, Qiuling Xu, Zhuo Zhang, Shengwei An, Shiqing Ma, Xinming Zhang · AAML · 11 Feb 2022
SSLGuard: A Watermarking Scheme for Self-supervised Learning Pre-trained Encoders
Tianshuo Cong, Xinlei He, Yang Zhang · 27 Jan 2022

Property Inference Attacks Against GANs
Junhao Zhou, Yufei Chen, Chao Shen, Yang Zhang · AAML, MIACV · 15 Nov 2021

Get a Model! Model Hijacking Attack Against Machine Learning Models
A. Salem, Michael Backes, Yang Zhang · AAML · 08 Nov 2021

Backdoor Learning: A Survey
Yiming Li, Yong Jiang, Zhifeng Li, Shutao Xia · AAML · 17 Jul 2020

Dynamic Backdoor Attacks Against Machine Learning Models
A. Salem, Rui Wen, Michael Backes, Shiqing Ma, Yang Zhang · AAML · 07 Mar 2020