arXiv:2410.08811
PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning
11 October 2024
Tingchen Fu, Mrinank Sharma, Philip H. S. Torr, Shay B. Cohen, David M. Krueger, Fazl Barez
Tags: AAML
Papers citing "PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning" (2 papers)
System Prompt Poisoning: Persistent Attacks on Large Language Models Beyond User Injection
Jiawei Guo, Haipeng Cai
Tags: SILM, AAML
10 May 2025
To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models
Zihao Zhu, Hongbao Zhang, Mingda Zhang, Ruotong Wang, Guanzong Wu, Ke Xu
Tags: AAML, LRM
16 Feb 2025