arXiv:2504.21700
XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs
30 April 2025
Marco Arazzi, Vignesh Kumar Kembu, Antonino Nocera, V. P.
Papers citing
"XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs"
3 / 3 papers shown
Atla Selene Mini: A General Purpose Evaluation Model
Andrei Alexandru, Antonia Calvi, Henry Broomfield, Jackson Golden, Kyle Dai, ..., Max Bartolo, Roman Engeler, Sashank Pisupati, Toby Drane, Young Sun Park
ALM, ELM, AILaw, LM&MA, LRM
101 | 6 | 0
27 Jan 2025
FLTrojan: Privacy Leakage Attacks against Federated Language Models Through Selective Weight Tampering
Md Rafi Ur Rashid, Vishnu Asutosh Dasu, Kang Gu, Najrin Sultana, Shagufta Mehnaz
AAML, FedML
176 | 12 | 0
24 Oct 2023
A Unified Approach to Interpreting Model Predictions
Scott M. Lundberg, Su-In Lee
FAtt
1.2K | 22,295 | 0
22 May 2017