ResearchTrend.AI
JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models
arXiv:2404.08793, versions v1, v2 (latest)

12 April 2024
Yingchaojie Feng
Zhizhang Chen
Zhining Kang
Sijia Wang
Haoyu Tian
Wei Zhang
Minfeng Zhu
Wei Chen

Papers citing "JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models"

4 / 4 papers shown
"I am bad": Interpreting Stealthy, Universal and Robust Audio Jailbreaks in Audio-Language Models
Isha Gupta
David Khachaturov
Robert D. Mullins
AAML, AuLLM
115
4
0
02 Feb 2025
Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs
Zhao Xu
Fan Liu
Hao Liu
AAML
126
16
0
13 Jun 2024
Adversarial Tuning: Defending Against Jailbreak Attacks for LLMs
Fan Liu
Zhao Xu
Hao Liu
AAML
130
13
0
07 Jun 2024
A Unified Approach to Interpreting Model Predictions
Scott M. Lundberg
Su-In Lee
FAtt
1.2K
22,295
0
22 May 2017