
One Trigger Token Is Enough: A Defense Strategy for Balancing Safety and Usability in Large Language Models
Papers citing "One Trigger Token Is Enough: A Defense Strategy for Balancing Safety and Usability in Large Language Models"
34 / 34 papers shown
Title | Authors |
---|---|
SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner | Xunguang Wang, Daoyuan Wu, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Shuai Wang, Yingjiu Li, Yang Liu, Ning Liu, Juergen Rahmel |
Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study | Yi Liu, Gelei Deng, Zhengzi Xu, Yuekang Li, Yaowen Zheng, Ying Zhang, Lida Zhao, Tianwei Zhang, Kailong Wang, Yang Liu |