Security Issues in Language Models

SILM

LLM security is the investigation of the failure modes of LLMs in use, the conditions that lead to them, and their mitigations. The failure modes include the vulnerabilities of LLM to leak sensitive information or inappropriate contents, inclusion of trojan samples on the web such that an LLM is trained on them to eventually show inappropriate or dangerous behaviours at their deployment, or various potential misuse of LLMs to cause harms and pursue illegal activities.

Neighbor communities

51015

Featured Papers

0 / 0 papers shown

Title

All papers

50 / 917 papers shown

Title
AutoAdv: Automated Adversarial Prompting for Multi-Turn Jailbreaking of Large Language Models Aashray Reddy Andrew Zagula Nicholas Saban AAML MU SILM 12 0 0 04 Nov 2025
The SDSC Satellite Reverse Proxy Service for Launching Secure Jupyter Notebooks on High-Performance Computing Systems Mary P Thomas Martin Kandes James McDougall Dmitry Mishan Scott Sakai Subhashini Sivagnanam Mahidhar Tatineni SILM SyDa 60 0 0 03 Nov 2025
Rescuing the Unpoisoned: Efficient Defense against Knowledge Corruption Attacks on RAG Systems Minseok Kim Hankook Lee Hyungjoon Koo AAML SILM 8 0 0 03 Nov 2025
Prompt Injection as an Emerging Threat: Evaluating the Resilience of Large Language Models Daniyal Ganiuly Assel Smaiyl SILM AAML ELM 16 0 0 03 Nov 2025
ToxicTextCLIP: Text-Based Poisoning and Backdoor Attacks on CLIP Pre-training Xin Yao Haiyang Zhao Yimin Chen Jiawei Guo Kecheng Huang Ming Zhao CLIP SILM VLM 8 0 0 01 Nov 2025
DRIP: Defending Prompt Injection via De-instruction Training and Residual Fusion Model Architecture Ruofan Liu Yun Lin Jin Song Dong AAML SILM 8 0 0 01 Nov 2025
Enhancing Adversarial Transferability by Balancing Exploration and Exploitation with Gradient-Guided Sampling Zenghao Niu Weicheng Xie Siyang Song Zitong Yu Feng Liu Linlin Shen AAML SILM 76 0 0 01 Nov 2025
Prevalence of Security and Privacy Risk-Inducing Usage of AI-based Conversational Agents Kathrin Grosse Nico Ebert SILM 64 0 0 31 Oct 2025
Secure Retrieval-Augmented Generation against Poisoning Attacks Zirui Cheng Jikai Sun Anjun Gao Yueyang Quan Zhuqing Liu Xiaohua Hu Minghong Fang SILM AAML 20 0 0 28 Oct 2025
AutoPrompt: Automated Red-Teaming of Text-to-Image Models via LLM-Driven Adversarial Prompts Yufan Liu Wanqian Zhang Huashan Chen Lin Wang Xiaojun Jia Zheng Lin Weiping Wang SILM 104 0 0 28 Oct 2025
S3C2 Summit 2025-03: Industry Secure Supply Chain Summit Elizabeth Lin Jonah Ghebremichael William Enck Yasemin Acar Michel Cukier A. Kapravelos Christian Kastner Laurie A. Williams SILM ELM 77 0 0 28 Oct 2025
Do Chatbots Walk the Talk of Responsible AI? Susan Ariel Aaronson Michael Moreno SILM AI4MH 126 0 0 28 Oct 2025
RefleXGen:The unexamined code is not worth usingIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025 Bin Wang Hui Li Aofan Liu BoTao Yang Ao Yang Y. Zhong Weixiang Huang Y. Zhang Runhuai Huang Weimin Zeng SILM 60 0 0 27 Oct 2025
Is Your Prompt Poisoning Code? Defect Induction Rates and Security Mitigation Strategies Bin Wang Y. Zhong MiDi Wan W. Yu YuanBing Ouyang Y. Huang Hui Li SILM AAML 40 0 0 27 Oct 2025
NeuroGenPoisoning: Neuron-Guided Attacks on Retrieval-Augmented Generation of LLM via Genetic Optimization of External Knowledge Hanyu Zhu Lance Fiondella Jiawei Yuan K. Zeng Long Jiao SILM AAML KELM 88 0 0 24 Oct 2025
The Trojan Example: Jailbreaking LLMs through Template Filling and Unsafety Reasoning Mingrui Liu Sixiao Zhang Cheng Long Kwok Yan Lam SILM 32 0 0 24 Oct 2025
Jailbreak Mimicry: Automated Discovery of Narrative-Based Jailbreaks for Large Language Models Pavlos Ntais AAML SILM 121 0 0 24 Oct 2025
A New Type of Adversarial Examples Xingyang Nie Guojie Xiao Su Pan Biao Wang Huilin Ge Tao Fang AAML SILM 86 0 0 22 Oct 2025
RESCUE: Retrieval Augmented Secure Code Generation Jiahao Shi Tianyi Zhang SILM 69 0 0 21 Oct 2025
CourtGuard: A Local, Multiagent Prompt Injection Classifier Isaac Wu Michael Maslowski LLMAG AAML SILM 74 0 0 20 Oct 2025
The Hidden Cost of Modeling P(X): Vulnerability to Membership Inference Attacks in Generative Text Classifiers Owais Makroo Siva Rajesh Kasa Sumegh Roychowdhury Karan Gupta Nikhil Pattisapu Santhosh Kumar Kasa Sumit Negi SILM 44 0 0 17 Oct 2025
Securing U.S. Critical Infrastructure: Lessons from Stuxnet and the Ukraine Power Grid Attacks Jack Vanlyssel SILM 20 0 0 16 Oct 2025
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers Andrew Zhao Reshmi Ghosh Vitor Carvalho Emily Lawton Keegan Hines Gao Huang Jack W. Stokes AAML SILM 24 0 0 16 Oct 2025
Open Shouldn't Mean Exempt: Open-Source Exceptionalism and Generative AI David Atkinson SILM 28 0 0 16 Oct 2025
In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers Avihay Cohen SILM LLMAG AI4CE 64 0 0 15 Oct 2025
PromptLocate: Localizing Prompt Injection Attacks Yuqi Jia Yupei Liu Zedian Shao Jinyuan Jia Neil Zhenqiang Gong SILM AAML 137 2 0 14 Oct 2025
RAG-Pull: Imperceptible Attacks on RAG Systems for Code Generation Vasilije Stambolic Aritra Dhar Lukas Cavigelli AAML SILM 54 0 0 13 Oct 2025
CoSPED: Consistent Soft Prompt Targeted Data Extraction and Defense Yang Zhuochen Fok Kar Wai Thing Vrizlynn AAML SILM 30 0 0 13 Oct 2025
Generative AI for Biosciences: Emerging Threats and Roadmap to Biosecurity Zaixi Zhang Souradip Chakraborty Amrit Singh Bedi Emilin Mathew Varsha Saravanan ... Eric Xing R. Altman George Church M. Y. Wang Mengdi Wang SILM 113 0 0 13 Oct 2025
Safeguarding Efficacy in Large Language Models: Evaluating Resistance to Human-Written and Algorithmic Adversarial Prompts Tiarnaigh Downey-Webb Olamide Jogunola Oluwaseun Ajao SILM AAML ELM 32 0 0 12 Oct 2025
One Token Embedding Is Enough to Deadlock Your Large Reasoning Model Mohan Zhang Yihua Zhang Jinghan Jia Zhangyang Wang Sijia Liu Tianlong Chen SILM LRM 36 0 0 12 Oct 2025
RIPRAG: Hack a Black-box Retrieval-Augmented Generation Question-Answering System with Reinforcement Learning Meng Xi Sihan Lv Yechen Jin Guanjie Cheng Naibo Wang Ying Li Jianwei Yin SILM AAML 86 0 0 11 Oct 2025
Text Prompt Injection of Vision Language Models Ruizhe Zhu SILM VLM 112 0 0 10 Oct 2025
Exploiting Web Search Tools of AI Agents for Data Exfiltration Dennis Rall Bernhard Bauer Mohit Mittal Thomas Fraunholz SILM AAML 108 0 0 10 Oct 2025
Differentially Private Synthetic Text Generation for Retrieval-Augmented Generation (RAG) Junki Mori Kazuya Kakizaki Taiki Miyagawa Jun Sakuma SILM SyDa 72 0 0 08 Oct 2025
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples Alexandra Souly Javier Rando Ed Chapman Xander Davies Burak Hasircioglu ... Erik Jones Chris Hicks Nicholas Carlini Y. Gal Robert Kirk AAML SILM 40 0 0 08 Oct 2025
RL Is a Hammer and LLMs Are Nails: A Simple Reinforcement Learning Recipe for Strong Prompt Injection Yuxin Wen Arman Zharmagambetov Ivan Evtimov Narine Kokhlikyan Tom Goldstein Kamalika Chaudhuri Chuan Guo OffRL SILM 52 1 0 06 Oct 2025
Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models Anindya Sundar Das Kangjie Chen M. Bhuyan SILM AAML 32 0 0 05 Oct 2025
Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods Yulin Chen Haoran Li Yuan Sui Yangqiu Song Bryan Hooi SILM AAML 103 0 0 04 Oct 2025
External Data Extraction Attacks against Retrieval-Augmented Large Language Models Yu He Y. Chen Y. Li Shuo Shao Leyi Qi Boheng Li Dacheng Tao Zhan Qin AAML SILM 92 0 0 03 Oct 2025
Bypassing Prompt Guards in Production with Controlled-Release Prompting Jaiden Fairoze Sanjam Garg Keewoo Lee Mingyuan Wang SILM AAML 108 0 0 02 Oct 2025
InvThink: Towards AI Safety via Inverse Reasoning Yubin Kim Taehan Kim Lizhou Fan Chunjong Park C. Breazeal Daniel J. McDuff Hae Won Park ReLM SILM MU LRM AI4CE 96 0 0 02 Oct 2025
A Call to Action for a Secure-by-Design Generative AI Paradigm Dalal Alharthi Ivan Roberto Kawaminami Garcia SILM AAML 32 0 0 01 Oct 2025
SecInfer: Preventing Prompt Injection via Inference-time Scaling Yupei Liu Yanting Wang Yuqi Jia Jinyuan Jia Neil Zhenqiang Gong SILM AAML LRM 136 1 0 29 Sep 2025
Reinforcement Learning-Based Prompt Template Stealing for Text-to-Image Models Xiaotian Zou SILM VPVLM 24 0 0 27 Sep 2025
Privy: Envisioning and Mitigating Privacy Risks for Consumer-facing AI Product Concepts Hao-Ping Lee Yu-Ju Yang Matthew Bilik Isadora Krsek Thomas Serban Von Davier Kyzyl Monteiro Jason Lin Shivani Agarwal Jodi Forlizzi Sauvik Das SILM 20 0 0 27 Sep 2025
ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents Hwan Chang Yonghyun Jun Hwanhee Lee SILM 48 0 0 26 Sep 2025
Your RAG is Unfair: Exposing Fairness Vulnerabilities in Retrieval-Augmented Generation via Backdoor Attacks Gaurav R. Bagwe Saket S. Chaturvedi Xiaolong Ma Xiaoyong Yuan Kuang-Ching Wang Lan Zhang SILM 48 1 0 26 Sep 2025
RAG Security and Privacy: Formalizing the Threat Model and Attack Surface Atousa Arzanipour R. Behnia Reza Ebrahimi Kaushik Dutta SILM 84 0 0 24 Sep 2025
Investigating Security Implications of Automatically Generated Code on the Software Supply Chain Xiaofan Li Xing Gao SILM AAML 48 0 0 24 Sep 2025

Loading #Papers per Month with "SILM"

Past speakers

Name (-)

Top Contributors

Name (-)

Top Organizations at ResearchTrend.AI

Name (-)

Social Events

Date	Location	Event
No social events available