arXiv: 2502.04204
Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence
6 February 2025
Shaopeng Fu
Liang Ding
Jingfeng Zhang
Di Wang
Author Contacts:
shaopeng.fu@kaust.edu.sa
liangding.liam@gmail.com
jingfeng.zhang@auckland.ac.nz
di.wang@kaust.edu.sa
Papers citing "Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence" (2 papers)
Adversarially Pretrained Transformers may be Universally Robust In-Context Learners
Soichiro Kumano, Hiroshi Kera, Toshihiko Yamasaki
AAML, 20 May 2025
Fraud-R1 : A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements
Shu Yang, Shenzhe Zhu, Zeyu Wu, Keyu Wang, Junchi Yao, Junchao Wu, Lijie Hu, Mengdi Li, Derek F. Wong, Di Wang
18 Feb 2025