ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2211.09527
  4. Cited By
Ignore Previous Prompt: Attack Techniques For Language Models

Ignore Previous Prompt: Attack Techniques For Language Models

17 November 2022
Fábio Perez
Ian Ribeiro
    SILM
ArXivPDFHTML

Papers citing "Ignore Previous Prompt: Attack Techniques For Language Models"

35 / 285 papers shown
Title
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming Yang
Fahad Shahbaz Khan
VLM
38
118
0
25 Jul 2023
LLM Censorship: A Machine Learning Challenge or a Computer Security
  Problem?
LLM Censorship: A Machine Learning Challenge or a Computer Security Problem?
David Glukhov
Ilia Shumailov
Y. Gal
Nicolas Papernot
Vardan Papyan
AAML
ELM
28
57
0
20 Jul 2023
Overthinking the Truth: Understanding how Language Models Process False
  Demonstrations
Overthinking the Truth: Understanding how Language Models Process False Demonstrations
Danny Halawi
Jean-Stanislas Denain
Jacob Steinhardt
28
53
0
18 Jul 2023
Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output
  Robustness of Large Language Models
Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output Robustness of Large Language Models
Huachuan Qiu
Shuai Zhang
Anqi Li
Hongliang He
Zhenzhong Lan
ALM
42
48
0
17 Jul 2023
MasterKey: Automated Jailbreak Across Multiple Large Language Model
  Chatbots
MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots
Gelei Deng
Yi Liu
Yuekang Li
Kailong Wang
Ying Zhang
Zefeng Li
Haoyu Wang
Tianwei Zhang
Yang Liu
SILM
37
118
0
16 Jul 2023
Effective Prompt Extraction from Language Models
Effective Prompt Extraction from Language Models
Yiming Zhang
Nicholas Carlini
Daphne Ippolito
MIACV
SILM
38
36
0
13 Jul 2023
Evaluating GPT-3.5 and GPT-4 on Grammatical Error Correction for
  Brazilian Portuguese
Evaluating GPT-3.5 and GPT-4 on Grammatical Error Correction for Brazilian Portuguese
Maria Carolina Penteado
Fábio Perez
15
7
0
27 Jun 2023
Opportunities and Risks of LLMs for Scalable Deliberation with Polis
Opportunities and Risks of LLMs for Scalable Deliberation with Polis
Christopher T. Small
Ivan Vendrov
Esin Durmus
Hadjar Homaei
Elizabeth Barry
Julien Cornebise
Ted Suzman
Deep Ganguli
Colin Megill
29
26
0
20 Jun 2023
Safeguarding Crowdsourcing Surveys from ChatGPT with Prompt Injection
Safeguarding Crowdsourcing Surveys from ChatGPT with Prompt Injection
Chaofan Wang
Samuel Kernan Freire
Mo Zhang
Jing Wei
Jorge Goncalves
V. Kostakos
Zhanna Sarsenbayeva
Christina Schneegass
A. Bozzon
E. Niforatos
SILM
16
11
0
15 Jun 2023
Protect Your Prompts: Protocols for IP Protection in LLM Applications
Protect Your Prompts: Protocols for IP Protection in LLM Applications
M. V. Wyk
M. Bekker
X. L. Richards
K. Nixon
SILM
28
2
0
09 Jun 2023
Prompt Injection attack against LLM-integrated Applications
Prompt Injection attack against LLM-integrated Applications
Yi Liu
Gelei Deng
Yuekang Li
Kailong Wang
Zihao Wang
...
Tianwei Zhang
Yepang Liu
Haoyu Wang
Yanhong Zheng
Yang Liu
SILM
41
317
0
08 Jun 2023
Mapping the Challenges of HCI: An Application and Evaluation of ChatGPT
  and GPT-4 for Mining Insights at Scale
Mapping the Challenges of HCI: An Application and Evaluation of ChatGPT and GPT-4 for Mining Insights at Scale
Jonas Oppenlaender
Joonas Hamalainen
25
6
0
08 Jun 2023
PromptRobust: Towards Evaluating the Robustness of Large Language Models
  on Adversarial Prompts
PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts
Kaijie Zhu
Jindong Wang
Jiaheng Zhou
Zichen Wang
Hao Chen
...
Linyi Yang
Weirong Ye
Yue Zhang
Neil Zhenqiang Gong
Xingxu Xie
SILM
39
144
0
07 Jun 2023
Seeing Seeds Beyond Weeds: Green Teaming Generative AI for Beneficial
  Uses
Seeing Seeds Beyond Weeds: Green Teaming Generative AI for Beneficial Uses
Logan Stapleton
Jordan Taylor
Sarah E Fox
Tongshuang Wu
Haiyi Zhu
28
13
0
30 May 2023
A Survey on ChatGPT: AI-Generated Contents, Challenges, and Solutions
A Survey on ChatGPT: AI-Generated Contents, Challenges, and Solutions
Yuntao Wang
Yanghe Pan
Miao Yan
Zhou Su
Tom H. Luan
27
146
0
25 May 2023
Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting
  Jailbreaks
Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks
Abhinav Rao
S. Vashistha
Atharva Naik
Somak Aditya
Monojit Choudhury
35
17
0
24 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through
  the Lens of Verification and Validation
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
45
82
0
19 May 2023
Sasha: Creative Goal-Oriented Reasoning in Smart Homes with Large
  Language Models
Sasha: Creative Goal-Oriented Reasoning in Smart Homes with Large Language Models
Evan King
Haoxiang Yu
Sangsu Lee
Christine Julien
LM&Ro
26
15
0
16 May 2023
Prompt Engineering for Healthcare: Methodologies and Applications
Prompt Engineering for Healthcare: Methodologies and Applications
Jiaqi Wang
Enze Shi
Sigang Yu
Zihao Wu
Chong Ma
...
Dajiang Zhu
Yixuan Yuan
Dinggang Shen
Tianming Liu
Shu Zhang
LM&MA
44
111
0
28 Apr 2023
Inducing anxiety in large language models increases exploration and bias
Inducing anxiety in large language models increases exploration and bias
Julian Coda-Forno
Kristin Witte
Akshay K. Jagadish
Marcel Binz
Zeynep Akata
Eric Schulz
AI4CE
38
2
0
21 Apr 2023
Safety Assessment of Chinese Large Language Models
Safety Assessment of Chinese Large Language Models
Hao Sun
Zhexin Zhang
Jiawen Deng
Jiale Cheng
Minlie Huang
ALM
ELM
32
75
0
20 Apr 2023
Creating Large Language Model Resistant Exams: Guidelines and Strategies
Creating Large Language Model Resistant Exams: Guidelines and Strategies
Simon Larsén
ELM
22
4
0
18 Apr 2023
In ChatGPT We Trust? Measuring and Characterizing the Reliability of
  ChatGPT
In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT
Xinyue Shen
Zhenpeng Chen
Michael Backes
Yang Zhang
24
55
0
18 Apr 2023
Tool Learning with Foundation Models
Tool Learning with Foundation Models
Yujia Qin
Shengding Hu
Yankai Lin
Weize Chen
Ning Ding
...
Cheng Yang
Tongshuang Wu
Heng Ji
Zhiyuan Liu
Maosong Sun
42
200
0
17 Apr 2023
Multi-step Jailbreaking Privacy Attacks on ChatGPT
Multi-step Jailbreaking Privacy Attacks on ChatGPT
Haoran Li
Dadi Guo
Wei Fan
Mingshi Xu
Jie Huang
Fanpu Meng
Yangqiu Song
SILM
47
321
0
11 Apr 2023
Inspecting and Editing Knowledge Representations in Language Models
Inspecting and Editing Knowledge Representations in Language Models
Evan Hernandez
Belinda Z. Li
Jacob Andreas
KELM
21
76
0
03 Apr 2023
Eliciting Latent Predictions from Transformers with the Tuned Lens
Eliciting Latent Predictions from Transformers with the Tuned Lens
Nora Belrose
Zach Furman
Logan Smith
Danny Halawi
Igor V. Ostrovsky
Lev McKinney
Stella Biderman
Jacob Steinhardt
22
193
0
14 Mar 2023
Not what you've signed up for: Compromising Real-World LLM-Integrated
  Applications with Indirect Prompt Injection
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Kai Greshake
Sahar Abdelnabi
Shailesh Mishra
C. Endres
Thorsten Holz
Mario Fritz
SILM
49
436
0
23 Feb 2023
Attacks in Adversarial Machine Learning: A Systematic Survey from the
  Life-cycle Perspective
Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective
Baoyuan Wu
Zihao Zhu
Li Liu
Qingshan Liu
Zhaofeng He
Siwei Lyu
AAML
44
21
0
19 Feb 2023
Towards Safer Generative Language Models: A Survey on Safety Risks,
  Evaluations, and Improvements
Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements
Jiawen Deng
Jiale Cheng
Hao Sun
Zhexin Zhang
Minlie Huang
LM&MA
ELM
34
16
0
18 Feb 2023
Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard
  Security Attacks
Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks
Daniel Kang
Xuechen Li
Ion Stoica
Carlos Guestrin
Matei A. Zaharia
Tatsunori Hashimoto
AAML
24
234
0
11 Feb 2023
Black Box Adversarial Prompting for Foundation Models
Black Box Adversarial Prompting for Foundation Models
Natalie Maus
Patrick Chao
Eric Wong
Jacob R. Gardner
VLM
28
56
0
08 Feb 2023
A Case Report On The "A.I. Locked-In Problem": social concerns with
  modern NLP
A Case Report On The "A.I. Locked-In Problem": social concerns with modern NLP
Yoshija Walter
LLMAG
13
2
0
22 Sep 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
333
11,953
0
04 Mar 2022
Extracting Training Data from Large Language Models
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
290
1,815
0
14 Dec 2020
Previous
123456