Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2401.12242
Cited By
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
20 January 2024
Zhen Xiang
Fengqing Jiang
Zidi Xiong
Bhaskar Ramasubramanian
Radha Poovendran
Bo Li
LRM
SILM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models"
19 / 19 papers shown
Title
Safety in Large Reasoning Models: A Survey
Cheng Wang
Yong-Jin Liu
Yangqiu Song
Duzhen Zhang
ZeLin Li
Junfeng Fang
Bryan Hooi
LRM
153
1
0
24 Apr 2025
Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control
Hannah Cyberey
David E. Evans
LLMSV
76
0
0
23 Apr 2025
ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs
Gejian Zhao
Hanzhou Wu
Xinpeng Zhang
Athanasios V. Vasilakos
LRM
38
1
0
08 Apr 2025
AdvBDGen: Adversarially Fortified Prompt-Specific Fuzzy Backdoor Generator Against LLM Alignment
Pankayaraj Pathmanathan
Udari Madhushani Sehwag
Michael-Andrei Panaitescu-Liess
Furong Huang
SILM
AAML
43
0
0
15 Oct 2024
CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
Yuetai Li
Zhangchen Xu
Fengqing Jiang
Luyao Niu
D. Sahabandu
Bhaskar Ramasubramanian
Radha Poovendran
SILM
AAML
56
6
0
18 Jun 2024
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
Wenkai Yang
Xiaohan Bi
Yankai Lin
Sishuo Chen
Jie Zhou
Xu Sun
LLMAG
AAML
44
53
0
17 Feb 2024
Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models
Shuai Zhao
Jinming Wen
Anh Tuan Luu
J. Zhao
Jie Fu
SILM
62
89
0
02 May 2023
Poisoning Language Models During Instruction Tuning
Alexander Wan
Eric Wallace
Sheng Shen
Dan Klein
SILM
92
124
0
01 May 2023
TrojText: Test-time Invisible Textual Trojan Insertion
Qiang Lou
Ye Liu
Bo Feng
37
23
0
03 Mar 2023
Compositional Semantic Parsing with Large Language Models
Andrew Drozdov
Nathanael Scharli
Ekin Akyuurek
Nathan Scales
Xinying Song
Xinyun Chen
Olivier Bousquet
Denny Zhou
ReLM
LRM
200
92
0
29 Sep 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
325
4,077
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
314
3,248
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
367
8,495
0
28 Jan 2022
Detecting Backdoor Attacks Against Point Cloud Classifiers
Zhen Xiang
David J. Miller
Siheng Chen
Xi Li
G. Kesidis
3DPC
AAML
40
15
0
20 Oct 2021
Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer
Fanchao Qi
Yangyi Chen
Xurui Zhang
Mukai Li
Zhiyuan Liu
Maosong Sun
AAML
SILM
82
175
0
14 Oct 2021
Subnet Replacement: Deployment-stage backdoor attack against deep neural networks in gray-box setting
Xiangyu Qi
Jifeng Zhu
Chulin Xie
Yong-Liang Yang
AAML
66
35
0
15 Jul 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
250
673
0
06 Jan 2021
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
290
1,815
0
14 Dec 2020
Clean-Label Backdoor Attacks on Video Recognition Models
Shihao Zhao
Xingjun Ma
Xiang Zheng
James Bailey
Jingjing Chen
Yu-Gang Jiang
AAML
196
274
0
06 Mar 2020
1