Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2401.16765
Cited By
A Cross-Language Investigation into Jailbreak Attacks in Large Language Models
30 January 2024
Jie Li
Yi Liu
Chongyang Liu
Ling Shi
Xiaoning Ren
Yaowen Zheng
Yang Liu
Yinxing Xue
AAML
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Cross-Language Investigation into Jailbreak Attacks in Large Language Models"
9 / 9 papers shown
Title
Attack and defense techniques in large language models: A survey and new perspectives
Zhiyu Liao
Kang Chen
Yuanguo Lin
Kangkang Li
Yunxuan Liu
Hefeng Chen
Xingwang Huang
Yuanhui Yu
AAML
54
0
0
02 May 2025
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
Jiahao Qiu
Yinghui He
Xinzhe Juan
Yishuo Wang
Yong-Jin Liu
Zixin Yao
Yue Wu
Xun Jiang
L. Yang
Mengdi Wang
AI4MH
73
0
0
13 Apr 2025
JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models
Delong Ran
Jinyuan Liu
Yichen Gong
Jingyi Zheng
Xinlei He
Tianshuo Cong
Anyu Wang
ELM
47
10
0
13 Jun 2024
SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner
Xunguang Wang
Daoyuan Wu
Zhenlan Ji
Zongjie Li
Pingchuan Ma
Shuai Wang
Yingjiu Li
Yang Liu
Ning Liu
Juergen Rahmel
AAML
76
8
0
08 Jun 2024
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani
Md Abdullah Al Mamun
Yu Fu
Pedram Zaree
Yue Dong
Nael B. Abu-Ghazaleh
AAML
147
146
0
16 Oct 2023
PentestGPT: An LLM-empowered Automatic Penetration Testing Tool
Gelei Deng
Yi Liu
Víctor Mayoral-Vilches
Peng Liu
Yuekang Li
Yuan Xu
Tianwei Zhang
Yang Liu
M. Pinzger
Stefan Rass
LLMAG
20
82
0
13 Aug 2023
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trkebacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
227
502
0
28 Sep 2022
A Survey on Understanding, Visualizations, and Explanation of Deep Neural Networks
Atefeh Shahroudnejad
FaML
AAML
AI4CE
XAI
51
34
0
02 Feb 2021
A Survey on Neural Network Interpretability
Yu Zhang
Peter Tiño
A. Leonardis
K. Tang
FaML
XAI
144
661
0
28 Dec 2020
1