Evolving Security in LLMs: A Study of Jailbreak Attacks and Defenses

Zhengchun Shang, Wenlan Wei
AAML · arXiv:2504.02080 · 2 April 2025

Papers citing "Evolving Security in LLMs: A Study of Jailbreak Attacks and Defenses"

9 papers shown

  1. The Tower of Babel Revisited: Multilingual Jailbreak Prompts on Closed-Source Large Language Models
     Linghan Huang, Haolin Jin, Zhaoge Bi, Pengyue Yang, Peizhou Zhao, Taozhao Chen, Xiongfei Wu, Lei Ma, Huaming Chen
     AAML · 18 May 2025

  2. A Survey on Large Language Models for Code Generation
     Juyong Jiang, Fan Wang, Jiasi Shen, Sungju Kim, Sunghun Kim
     01 Jun 2024

  3. Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters
     Haibo Jin, Andy Zhou, Joe D. Menke, Haohan Wang
     30 May 2024

  4. Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations
     Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, ..., Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa
     AI4MH · 07 Dec 2023

  5. SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks
     Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
     AAML · 05 Oct 2023

  6. GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
     Jiahao Yu, Xingwei Lin, Zheng Yu, Xinyu Xing
     SILM · 19 Sep 2023

  7. Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
     Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, C. Endres, Thorsten Holz, Mario Fritz
     SILM · 23 Feb 2023

  8. Thief, Beware of What Get You There: Towards Understanding Model Extraction Attack
     Xinyi Zhang, Chengfang Fang, Jie Shi
     MIACV, MLAU, SILM · 13 Apr 2021

  9. Text Summarization with Pretrained Encoders
     Yang Liu, Mirella Lapata
     MILM · 22 Aug 2019