ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Communities
  3. ...

Neighbor communities

0 / 0 papers shown
Title
Top Contributors
Name# Papers# Citations
Social Events
DateLocationEvent
  1. Home
  2. Communities
  3. SILM

Security Issues in Language Models

SILM
More data

LLM security is the investigation of the failure modes of LLMs in use, the conditions that lead to them, and their mitigations. The failure modes include the vulnerabilities of LLM to leak sensitive information or inappropriate contents, inclusion of trojan samples on the web such that an LLM is trained on them to eventually show inappropriate or dangerous behaviours at their deployment, or various potential misuse of LLMs to cause harms and pursue illegal activities.

Neighbor communities

51015

Featured Papers

0 / 0 papers shown
Title

All papers

50 / 910 papers shown
Title
Prevalence of Security and Privacy Risk-Inducing Usage of AI-based Conversational Agents
Kathrin Grosse
Nico Ebert
SILM
48
0
0
03 Nov 2025
Secure Retrieval-Augmented Generation against Poisoning Attacks
Secure Retrieval-Augmented Generation against Poisoning Attacks
Zirui Cheng
Jikai Sun
Anjun Gao
Yueyang Quan
Zhuqing Liu
Xiaohua Hu
Minghong Fang
SILMAAML
4
0
0
28 Oct 2025
S3C2 Summit 2025-03: Industry Secure Supply Chain Summit
S3C2 Summit 2025-03: Industry Secure Supply Chain Summit
Elizabeth Lin
Jonah Ghebremichael
William Enck
Yasemin Acar
Michel Cukier
Alexandros Kapravelos
Christian Kastner
Laurie Williams
SILMELM
61
0
0
28 Oct 2025
Do Chatbots Walk the Talk of Responsible AI?
Do Chatbots Walk the Talk of Responsible AI?
Susan Ariel Aaronson
Michael Moreno
SILMAI4MH
90
0
0
28 Oct 2025
AutoPrompt: Automated Red-Teaming of Text-to-Image Models via LLM-Driven Adversarial Prompts
AutoPrompt: Automated Red-Teaming of Text-to-Image Models via LLM-Driven Adversarial Prompts
Yufan Liu
Wanqian Zhang
Huashan Chen
Lin Wang
Xiaojun Jia
Zheng Lin
Weiping Wang
SILM
76
0
0
28 Oct 2025
Is Your Prompt Poisoning Code? Defect Induction Rates and Security Mitigation Strategies
Is Your Prompt Poisoning Code? Defect Induction Rates and Security Mitigation Strategies
Bin Wang
Y. Zhong
MiDi Wan
W. Yu
YuanBing Ouyang
Y. Huang
Hui Li
SILMAAML
28
0
0
27 Oct 2025
RefleXGen:The unexamined code is not worth using
RefleXGen:The unexamined code is not worth usingIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Bin Wang
Hui Li
Aofan Liu
BoTao Yang
Ao Yang
Y. Zhong
Weixiang Huang
Y. Zhang
Runhuai Huang
Weimin Zeng
SILM
48
0
0
27 Oct 2025
NeuroGenPoisoning: Neuron-Guided Attacks on Retrieval-Augmented Generation of LLM via Genetic Optimization of External Knowledge
NeuroGenPoisoning: Neuron-Guided Attacks on Retrieval-Augmented Generation of LLM via Genetic Optimization of External Knowledge
Hanyu Zhu
Lance Fiondella
Jiawei Yuan
K. Zeng
Long Jiao
SILMAAMLKELM
76
0
0
24 Oct 2025
The Trojan Example: Jailbreaking LLMs through Template Filling and Unsafety Reasoning
The Trojan Example: Jailbreaking LLMs through Template Filling and Unsafety Reasoning
Mingrui Liu
Sixiao Zhang
Cheng Long
Kwok Yan Lam
SILM
32
0
0
24 Oct 2025
Jailbreak Mimicry: Automated Discovery of Narrative-Based Jailbreaks for Large Language Models
Jailbreak Mimicry: Automated Discovery of Narrative-Based Jailbreaks for Large Language Models
Pavlos Ntais
AAMLSILM
109
0
0
24 Oct 2025
A New Type of Adversarial Examples
A New Type of Adversarial Examples
Xingyang Nie
Guojie Xiao
Su Pan
Biao Wang
Huilin Ge
Tao Fang
AAMLSILM
66
0
0
22 Oct 2025
RESCUE: Retrieval Augmented Secure Code Generation
RESCUE: Retrieval Augmented Secure Code Generation
Jiahao Shi
Tianyi Zhang
SILM
49
0
0
21 Oct 2025
CourtGuard: A Local, Multiagent Prompt Injection Classifier
CourtGuard: A Local, Multiagent Prompt Injection Classifier
Isaac Wu
Michael Maslowski
LLMAGAAMLSILM
66
0
0
20 Oct 2025
The Hidden Cost of Modeling P(X): Vulnerability to Membership Inference Attacks in Generative Text Classifiers
The Hidden Cost of Modeling P(X): Vulnerability to Membership Inference Attacks in Generative Text Classifiers
Owais Makroo
Siva Rajesh Kasa
Sumegh Roychowdhury
Karan Gupta
Nikhil Pattisapu
Santhosh Kumar Kasa
Sumit Negi
SILM
20
0
0
17 Oct 2025
Open Shouldn't Mean Exempt: Open-Source Exceptionalism and Generative AI
Open Shouldn't Mean Exempt: Open-Source Exceptionalism and Generative AI
David Atkinson
SILM
20
0
0
16 Oct 2025
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers
Andrew Zhao
Reshmi Ghosh
Vitor Carvalho
Emily Lawton
Keegan Hines
Gao Huang
Jack W. Stokes
AAMLSILM
8
0
0
16 Oct 2025
Securing U.S. Critical Infrastructure: Lessons from Stuxnet and the Ukraine Power Grid Attacks
Securing U.S. Critical Infrastructure: Lessons from Stuxnet and the Ukraine Power Grid Attacks
Jack Vanlyssel
SILM
8
0
0
16 Oct 2025
In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers
In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers
Avihay Cohen
SILMLLMAGAI4CE
48
0
0
15 Oct 2025
PromptLocate: Localizing Prompt Injection Attacks
PromptLocate: Localizing Prompt Injection Attacks
Yuqi Jia
Yupei Liu
Zedian Shao
Jinyuan Jia
Neil Zhenqiang Gong
SILMAAML
117
2
0
14 Oct 2025
RAG-Pull: Imperceptible Attacks on RAG Systems for Code Generation
RAG-Pull: Imperceptible Attacks on RAG Systems for Code Generation
Vasilije Stambolic
Aritra Dhar
Lukas Cavigelli
AAMLSILM
36
0
0
13 Oct 2025
Generative AI for Biosciences: Emerging Threats and Roadmap to Biosecurity
Generative AI for Biosciences: Emerging Threats and Roadmap to Biosecurity
Zaixi Zhang
Souradip Chakraborty
Amrit Singh Bedi
Emilin Mathew
Varsha Saravanan
...
Jian Ma
Eric Xing
R. Altman
George Church
M. Y. Wang
SILM
57
0
0
13 Oct 2025
CoSPED: Consistent Soft Prompt Targeted Data Extraction and Defense
CoSPED: Consistent Soft Prompt Targeted Data Extraction and Defense
Yang Zhuochen
Fok Kar Wai
Thing Vrizlynn
AAMLSILM
4
0
0
13 Oct 2025
One Token Embedding Is Enough to Deadlock Your Large Reasoning Model
One Token Embedding Is Enough to Deadlock Your Large Reasoning Model
Mohan Zhang
Yihua Zhang
Jinghan Jia
Zhangyang Wang
Sijia Liu
Tianlong Chen
SILMLRM
28
0
0
12 Oct 2025
Safeguarding Efficacy in Large Language Models: Evaluating Resistance to Human-Written and Algorithmic Adversarial Prompts
Safeguarding Efficacy in Large Language Models: Evaluating Resistance to Human-Written and Algorithmic Adversarial Prompts
Tiarnaigh Downey-Webb
Olamide Jogunola
Oluwaseun Ajao
SILMAAMLELM
20
0
0
12 Oct 2025
RIPRAG: Hack a Black-box Retrieval-Augmented Generation Question-Answering System with Reinforcement Learning
RIPRAG: Hack a Black-box Retrieval-Augmented Generation Question-Answering System with Reinforcement Learning
Meng Xi
Sihan Lv
Yechen Jin
Guanjie Cheng
Naibo Wang
Ying Li
Jianwei Yin
SILMAAML
62
0
0
11 Oct 2025
Exploiting Web Search Tools of AI Agents for Data Exfiltration
Exploiting Web Search Tools of AI Agents for Data Exfiltration
Dennis Rall
Bernhard Bauer
Mohit Mittal
Thomas Fraunholz
SILMAAML
96
0
0
10 Oct 2025
Text Prompt Injection of Vision Language Models
Text Prompt Injection of Vision Language Models
Ruizhe Zhu
SILMVLM
104
0
0
10 Oct 2025
Differentially Private Synthetic Text Generation for Retrieval-Augmented Generation (RAG)
Differentially Private Synthetic Text Generation for Retrieval-Augmented Generation (RAG)
Junki Mori
Kazuya Kakizaki
Taiki Miyagawa
Jun Sakuma
SILMSyDa
60
0
0
08 Oct 2025
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples
Alexandra Souly
Javier Rando
Ed Chapman
Xander Davies
Burak Hasircioglu
...
Erik Jones
Chris Hicks
Nicholas Carlini
Y. Gal
Robert Kirk
AAMLSILM
32
0
0
08 Oct 2025
RL Is a Hammer and LLMs Are Nails: A Simple Reinforcement Learning Recipe for Strong Prompt Injection
RL Is a Hammer and LLMs Are Nails: A Simple Reinforcement Learning Recipe for Strong Prompt Injection
Yuxin Wen
Arman Zharmagambetov
Ivan Evtimov
Narine Kokhlikyan
Tom Goldstein
Kamalika Chaudhuri
Chuan Guo
OffRLSILM
48
1
0
06 Oct 2025
Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models
Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models
Anindya Sundar Das
Kangjie Chen
M. Bhuyan
SILMAAML
24
0
0
05 Oct 2025
Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods
Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods
Yulin Chen
Haoran Li
Yuan Sui
Yangqiu Song
Bryan Hooi
SILMAAML
83
0
0
04 Oct 2025
External Data Extraction Attacks against Retrieval-Augmented Large Language Models
External Data Extraction Attacks against Retrieval-Augmented Large Language Models
Yu He
Y. Chen
Y. Li
Shuo Shao
Leyi Qi
Boheng Li
Dacheng Tao
Zhan Qin
AAMLSILM
80
0
0
03 Oct 2025
InvThink: Towards AI Safety via Inverse Reasoning
InvThink: Towards AI Safety via Inverse Reasoning
Yubin Kim
Taehan Kim
Eugene Park
Chunjong Park
C. Breazeal
Daniel J. McDuff
Hae Won Park
ReLMSILMMULRMAI4CE
92
0
0
02 Oct 2025
Bypassing Prompt Guards in Production with Controlled-Release Prompting
Bypassing Prompt Guards in Production with Controlled-Release Prompting
Jaiden Fairoze
Sanjam Garg
Keewoo Lee
Mingyuan Wang
SILMAAML
104
0
0
02 Oct 2025
A Call to Action for a Secure-by-Design Generative AI Paradigm
A Call to Action for a Secure-by-Design Generative AI Paradigm
Dalal Alharthi
Ivan Roberto Kawaminami Garcia
SILMAAML
24
0
0
01 Oct 2025
SecInfer: Preventing Prompt Injection via Inference-time Scaling
SecInfer: Preventing Prompt Injection via Inference-time Scaling
Yupei Liu
Yanting Wang
Yuqi Jia
Jinyuan Jia
Neil Zhenqiang Gong
SILMAAMLLRM
104
1
0
29 Sep 2025
Reinforcement Learning-Based Prompt Template Stealing for Text-to-Image Models
Reinforcement Learning-Based Prompt Template Stealing for Text-to-Image Models
Xiaotian Zou
SILMVPVLM
8
0
0
27 Sep 2025
Privy: Envisioning and Mitigating Privacy Risks for Consumer-facing AI Product Concepts
Privy: Envisioning and Mitigating Privacy Risks for Consumer-facing AI Product Concepts
Hao-Ping Lee
Yu-Ju Yang
Matthew Bilik
Isadora Krsek
Thomas Serban Von Davier
Kyzyl Monteiro
Jason Lin
Shivani Agarwal
Jodi Forlizzi
Sauvik Das
SILM
4
0
0
27 Sep 2025
ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents
ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents
Hwan Chang
Yonghyun Jun
Hwanhee Lee
SILM
40
0
0
26 Sep 2025
Your RAG is Unfair: Exposing Fairness Vulnerabilities in Retrieval-Augmented Generation via Backdoor Attacks
Your RAG is Unfair: Exposing Fairness Vulnerabilities in Retrieval-Augmented Generation via Backdoor Attacks
Gaurav R. Bagwe
Saket S. Chaturvedi
Xiaolong Ma
Xiaoyong Yuan
Kuang-Ching Wang
Lan Zhang
SILM
40
1
0
26 Sep 2025
Investigating Security Implications of Automatically Generated Code on the Software Supply Chain
Investigating Security Implications of Automatically Generated Code on the Software Supply Chain
Xiaofan Li
Xing Gao
SILMAAML
40
0
0
24 Sep 2025
RAG Security and Privacy: Formalizing the Threat Model and Attack Surface
RAG Security and Privacy: Formalizing the Threat Model and Attack Surface
Atousa Arzanipour
R. Behnia
Reza Ebrahimi
Kaushik Dutta
SILM
64
0
0
24 Sep 2025
Design and Implementation of a Secure RAG-Enhanced AI Chatbot for Smart Tourism Customer Service: Defending Against Prompt Injection Attacks -- A Case Study of Hsinchu, Taiwan
Design and Implementation of a Secure RAG-Enhanced AI Chatbot for Smart Tourism Customer Service: Defending Against Prompt Injection Attacks -- A Case Study of Hsinchu, Taiwan
Yu-Kai Shih
You-Kai Kang
SILM
31
0
0
22 Sep 2025
Enterprise AI Must Enforce Participant-Aware Access Control
Enterprise AI Must Enforce Participant-Aware Access Control
Shashank Shreedhar Bhatt
Tanmay Rajore
Khushboo Aggarwal
Ganesh Ananthanarayanan
Ranveer Chandra
...
Emre Kiciman
Sumit Kumar Pandey
Srinath T. V. Setty
Rahul Sharma
Teijia Zhao
AAMLSILM
49
1
0
18 Sep 2025
AIP: Subverting Retrieval-Augmented Generation via Adversarial Instructional Prompt
AIP: Subverting Retrieval-Augmented Generation via Adversarial Instructional Prompt
Saket S. Chaturvedi
Gaurav R. Bagwe
Lan Zhang
Xiaoyong Yuan
SILMAAML
28
0
0
18 Sep 2025
Who Taught the Lie? Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation
Who Taught the Lie? Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation
Baolei Zhang
Haoran Xin
Yuxi Chen
Zhuqing Liu
Biao Yi
Tong Li
Lihai Nie
Zheli Liu
Minghong Fang
SILM
113
0
0
17 Sep 2025
A Multi-Agent LLM Defense Pipeline Against Prompt Injection Attacks
A Multi-Agent LLM Defense Pipeline Against Prompt Injection Attacks
S M Asif Hossain
Ruksat Khan Shayoni
Mohd Ruhul Ameen
Akif Islam
M. F. Mridha
Jungpil Shin
LLMAGSILMAAML
84
0
0
16 Sep 2025
Beyond PII: How Users Attempt to Estimate and Mitigate Implicit LLM Inference
Beyond PII: How Users Attempt to Estimate and Mitigate Implicit LLM Inference
Synthia Wang
Sai Teja Peddinti
Nina Taft
Nick Feamster
SILMPILM
16
1
0
15 Sep 2025
Early Approaches to Adversarial Fine-Tuning for Prompt Injection Defense: A 2022 Study of GPT-3 and Contemporary Models
Early Approaches to Adversarial Fine-Tuning for Prompt Injection Defense: A 2022 Study of GPT-3 and Contemporary Models
Gustavo Sandoval
Denys Fenchenko
Junyao Chen
AAMLSILM
32
0
0
15 Sep 2025
Loading #Papers per Month with "SILM"
Past speakers
Name (-)
Top Contributors
Name (-)
Top Organizations at ResearchTrend.AI
Name (-)
Social Events
DateLocationEvent
No social events available