Ignore Previous Prompt: Attack Techniques For Language Models

17 November 2022
Fábio Perez
Ian Ribeiro
    SILM

Papers citing "Ignore Previous Prompt: Attack Techniques For Language Models"

50 / 284 papers shown
Exploring Autonomous Agents through the Lens of Large Language Models: A Review
Saikat Barua
LM&MA
LLMAG
33
15
0
05 Apr 2024
Vocabulary Attack to Hijack Large Language Model Applications
Patrick Levi
Christoph P. Neumann
AAML
24
8
0
03 Apr 2024
Exploring the Privacy Protection Capabilities of Chinese Large Language Models
Yuqi Yang
Xiaowen Huang
Jitao Sang
ELM
PILM
AILaw
49
1
0
27 Mar 2024
Don't Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models
Zhiyuan Yu
Xiaogeng Liu
Shunning Liang
Zach Cameron
Chaowei Xiao
Ning Zhang
30
40
0
26 Mar 2024
Optimization-based Prompt Injection Attack to LLM-as-a-Judge
Jiawen Shi
Zenghui Yuan
Yinuo Liu
Yue Huang
Pan Zhou
Lichao Sun
Neil Zhenqiang Gong
AAML
45
39
0
26 Mar 2024
Risk and Response in Large Language Models: Evaluating Key Threat Categories
Bahareh Harandizadeh
A. Salinas
Fred Morstatter
25
3
0
22 Mar 2024
Securing Large Language Models: Threats, Vulnerabilities and Responsible Practices
Sara Abdali
Richard Anarfi
C. Barberan
Jia He
PILM
70
24
0
19 Mar 2024
Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language Models
Yi Luo
Zheng-Wen Lin
Yuhao Zhang
Jiashuo Sun
Chen Lin
Chengjin Xu
Xiangdong Su
Yelong Shen
Jian Guo
Yeyun Gong
LM&MA
ELM
ALM
AI4TS
30
1
0
18 Mar 2024
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
Egor Zverev
Sahar Abdelnabi
Soroush Tabesh
Mario Fritz
Christoph H. Lampert
56
19
0
11 Mar 2024
Automatic and Universal Prompt Injection Attacks against Large Language Models
Xiaogeng Liu
Zhiyuan Yu
Yizhe Zhang
Ning Zhang
Chaowei Xiao
SILM
AAML
46
33
0
07 Mar 2024
Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks
Dario Pasquini
Martin Strohmeier
Carmela Troncoso
AAML
34
21
0
06 Mar 2024
Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications
Stav Cohen
Ron Bitton
Ben Nassi
38
18
0
05 Mar 2024
InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents
Qiusi Zhan
Zhixiang Liang
Zifan Ying
Daniel Kang
LLMAG
46
73
0
05 Mar 2024
PRSA: PRompt Stealing Attacks against Large Language Models
Yong Yang
Changjiang Li
Yi Jiang
Xi Chen
Haoyu Wang
Xuhong Zhang
Zonghui Wang
Shouling Ji
SILM
AAML
36
1
0
29 Feb 2024
A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems
Fangzhou Wu
Ning Zhang
Somesh Jha
P. McDaniel
Chaowei Xiao
34
68
0
28 Feb 2024
Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction
Tong Liu
Yingjie Zhang
Zhe Zhao
Yinpeng Dong
Guozhu Meng
Kai Chen
AAML
51
44
0
28 Feb 2024
Exploring Advanced Methodologies in Security Evaluation for LLMs
Junming Huang
Jiawei Zhang
Qi Wang
Weihong Han
Yanchun Zhang
45
0
0
28 Feb 2024
Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems
Zhenting Qi
Hanlin Zhang
Eric Xing
Sham Kakade
Hima Lakkaraju
SILM
44
18
0
27 Feb 2024
WIPI: A New Web Threat for LLM-Driven Web Agents
Fangzhou Wu
Shutong Wu
Yulong Cao
Chaowei Xiao
LLMAG
34
18
0
26 Feb 2024
Evaluating Robustness of Generative Search Engine on Adversarial Factual Questions
Xuming Hu
Xiaochuan Li
Junzhe Chen
Hai-Tao Zheng
Yangning Li
...
Yasheng Wang
Qun Liu
Lijie Wen
Philip S. Yu
Zhijiang Guo
AAML
ELM
27
5
0
25 Feb 2024
PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails
Neal Mangaokar
Ashish Hooda
Jihye Choi
Shreyas Chandrashekaran
Kassem Fawaz
Somesh Jha
Atul Prakash
AAML
32
35
0
24 Feb 2024
Fast Adversarial Attacks on Language Models In One GPU Minute
Vinu Sankar Sadasivan
Shoumik Saha
Gaurang Sriramanan
Priyatham Kattakinda
Atoosa Malemir Chegini
S. Feizi
MIALM
43
34
0
23 Feb 2024
A Conversational Brain-Artificial Intelligence Interface
Anja Meunier
Michal Robert Zák
Lucas Munz
Sofiya Garkot
Manuel Eder
Jiachen Xu
Moritz Grosse-Wentrup
40
0
0
22 Feb 2024
Coercing LLMs to do and reveal (almost) anything
Jonas Geiping
Alex Stein
Manli Shu
Khalid Saifullah
Yuxin Wen
Tom Goldstein
AAML
48
43
0
21 Feb 2024
Generative AI Security: Challenges and Countermeasures
Banghua Zhu
Norman Mu
Jiantao Jiao
David A. Wagner
AAML
SILM
61
8
0
20 Feb 2024
How Susceptible are Large Language Models to Ideological Manipulation?
Kai Chen
Zihao He
Jun Yan
Taiwei Shi
Kristina Lerman
40
10
0
18 Feb 2024
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents
Lingbo Mo
Zeyi Liao
Boyuan Zheng
Yu-Chuan Su
Chaowei Xiao
Huan Sun
AAML
LLMAG
49
15
0
15 Feb 2024
PAL: Proxy-Guided Black-Box Attack on Large Language Models
Chawin Sitawarin
Norman Mu
David A. Wagner
Alexandre Araujo
ELM
29
29
0
15 Feb 2024
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
Zhichen Dong
Zhanhui Zhou
Chao Yang
Jing Shao
Yu Qiao
ELM
52
55
0
14 Feb 2024
Instruction Backdoor Attacks Against Customized LLMs
Rui Zhang
Hongwei Li
Rui Wen
Wenbo Jiang
Yuan Zhang
Michael Backes
Yun Shen
Yang Zhang
AAML
SILM
30
24
0
14 Feb 2024
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Xing-ming Guo
Fangxu Yu
Huan Zhang
Lianhui Qin
Bin Hu
AAML
117
69
0
13 Feb 2024
PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models
Wei Zou
Runpeng Geng
Binghui Wang
Jinyuan Jia
SILM
39
45
1
12 Feb 2024
Differentially Private Training of Mixture of Experts Models
Pierre Tholoniat
Huseyin A. Inan
Janardhan Kulkarni
Robert Sim
MoE
41
1
0
11 Feb 2024
StruQ: Defending Against Prompt Injection with Structured Queries
Sizhe Chen
Julien Piet
Chawin Sitawarin
David A. Wagner
SILM
AAML
30
65
0
09 Feb 2024
Fight Back Against Jailbreaking via Prompt Adversarial Tuning
Yichuan Mo
Yuji Wang
Zeming Wei
Yisen Wang
AAML
SILM
49
25
0
09 Feb 2024
Comprehensive Assessment of Jailbreak Attacks Against LLMs
Junjie Chu
Yugeng Liu
Ziqing Yang
Xinyue Shen
Michael Backes
Yang Zhang
AAML
37
66
0
08 Feb 2024
Trustworthy Distributed AI Systems: Robustness, Privacy, and Governance
Wenqi Wei
Ling Liu
31
16
0
02 Feb 2024
An Early Categorization of Prompt Injection Attacks on Large Language Models
Sippo Rossi
Alisia Marianne Michel
R. Mukkamala
J. Thatcher
SILM
AAML
26
16
0
31 Jan 2024
Security and Privacy Challenges of Large Language Models: A Survey
B. Das
M. H. Amini
Yanzhao Wu
PILM
ELM
19
103
0
30 Jan 2024
Fortifying Ethical Boundaries in AI: Advanced Strategies for Enhancing Security in Large Language Models
Yunhong He
Jianling Qiu
Wei Zhang
Zhe Yuan
32
3
0
27 Jan 2024
All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks
Kazuhiro Takemoto
39
21
0
18 Jan 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
Zujie Wen
Ke Xu
Qi Li
57
56
0
11 Jan 2024
Malla: Demystifying Real-world Large Language Model Integrated Malicious Services
Zilong Lin
Jian Cui
Xiaojing Liao
Xiaofeng Wang
27
19
0
06 Jan 2024
MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance
Renjie Pi
Tianyang Han
Jianshu Zhang
Yueqi Xie
Rui Pan
Qing Lian
Hanze Dong
Jipeng Zhang
Tong Zhang
AAML
23
59
0
05 Jan 2024
Detection and Defense Against Prominent Attacks on Preconditioned LLM-Integrated Virtual Assistants
C. Chan
Daniel Wankit Yip
Aysan Esmradi
26
1
0
02 Jan 2024
A Novel Evaluation Framework for Assessing Resilience Against Prompt Injection Attacks in Large Language Models
Daniel Wankit Yip
Aysan Esmradi
C. Chan
AAML
28
11
0
02 Jan 2024
Jatmo: Prompt Injection Defense by Task-Specific Finetuning
Julien Piet
Maha Alrashed
Chawin Sitawarin
Sizhe Chen
Zeming Wei
Elizabeth Sun
Basel Alomair
David A. Wagner
AAML
SyDa
83
52
0
29 Dec 2023
How Far Are LLMs from Believable AI? A Benchmark for Evaluating the Believability of Human Behavior Simulation
Yang Xiao
Yi Cheng
Jinlan Fu
Jiashuo Wang
Wenjie Li
Pengfei Liu
LLMAG
54
4
0
28 Dec 2023
Exploiting Novel GPT-4 APIs
Kellin Pelrine
Mohammad Taufeeque
Michał Zając
Euan McLean
Adam Gleave
SILM
23
20
0
21 Dec 2023
Mutual-modality Adversarial Attack with Semantic Perturbation
Jingwen Ye
Ruonan Yu
Songhua Liu
Xinchao Wang
AAML
29
9
0
20 Dec 2023