ResearchTrend.AI
LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks

19 December 2023
Saad Ullah
Mingji Han
Saurabh Pujar
Hammond Pearce
Ayse K. Coskun
Gianluca Stringhini
ELM, LRM

Papers citing "LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks"

20 / 20 papers shown
Automated Static Vulnerability Detection via a Holistic Neuro-symbolic Approach
Penghui Li
Songchen Yao
Josef Sarfati Korich
Changhua Luo
Jianjia Yu
Yinzhi Cao
Junfeng Yang
443
0
0
22 Apr 2025
XOXO: Stealthy Cross-Origin Context Poisoning Attacks against AI Coding Assistants
Adam Štorek
Mukur Gupta
Noopur Bhatt
Aditya Gupta
Janie Kim
Prashast Srivastava
Suman Jana
AAML
152
0
0
18 Mar 2025
Do LLMs Consider Security? An Empirical Study on Responses to Programming Questions
Amirali Sajadi
Binh Le
A. Nguyen
Kostadin Damevski
Preetha Chatterjee
89
3
0
20 Feb 2025
LAMD: Context-driven Android Malware Detection and Classification with LLMs
Xingzhi Qian
Xinran Zheng
Yiling He
Shuo Yang
Lorenzo Cavallaro
124
3
0
18 Feb 2025
ProveRAG: Provenance-Driven Vulnerability Analysis with Automated Retrieval-Augmented LLMs
Reza Fayyazi
Stella Hoyos Trueba
Michael Zuzak
S. Yang
65
0
0
22 Oct 2024
Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection
Niklas Risse
Marcel Böhme
86
7
0
23 Aug 2024
What Did I Do Wrong? Quantifying LLMs' Sensitivity and Consistency to Prompt Engineering
Federico Errica
G. Siracusano
D. Sanvito
Roberto Bifulco
151
25
0
18 Jun 2024
Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu
Shenao Wang
Ningke Li
Kaidi Wang
Yanjie Zhao
Kai Chen
Ting Yu
Yang Liu
Haoyu Wang
109
40
0
08 May 2024
LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning
Yuqiang Sun
Daoyuan Wu
Yue Xue
Han Liu
Wei Ma
Lyuye Zhang
Miaolei Shi
Yingjiu Li
ELM
125
54
0
29 Jan 2024
Large Language Models Understand and Can be Enhanced by Emotional Stimuli
Cheng Li
Jindong Wang
Yixuan Zhang
Kaijie Zhu
Wenxin Hou
Jianxun Lian
Fang Luo
Qiang Yang
Xing Xie
LRM
130
133
0
14 Jul 2023
Uncovering the Limits of Machine Learning for Automatic Vulnerability Detection
Niklas Risse
Marcel Böhme
AAML
119
30
0
28 Jun 2023
Faithful Reasoning Using Large Language Models
Antonia Creswell
Murray Shanahan
ReLM, LRM
64
125
0
30 Aug 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM, ReLM, LRM
283
2,480
0
15 Jun 2022
VulBERTa: Simplified Source Code Pre-Training for Vulnerability Detection
Hazim Hanif
S. Maffeis
110
111
0
25 May 2022
Transformer-Based Language Models for Software Vulnerability Detection
Chandra Thapa
Seung Ick Jang
Muhammad Ejaz Ahmed
S. Çamtepe
J. Pieprzyk
Surya Nepal
79
97
0
07 Apr 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro, LRM, AI4CE, ReLM
817
9,576
0
28 Jan 2022
Show Your Work: Scratchpads for Intermediate Computation with Language Models
Maxwell Nye
Anders Andreassen
Guy Gur-Ari
Henryk Michalewski
Jacob Austin
...
Aitor Lewkowycz
Maarten Bosma
D. Luan
Charles Sutton
Augustus Odena
ReLM, LRM
181
746
0
30 Nov 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM, ALM
233
5,539
0
07 Jul 2021
Unified Pre-training for Program Understanding and Generation
Wasi Uddin Ahmad
Saikat Chakraborty
Baishakhi Ray
Kai-Wei Chang
135
766
0
10 Mar 2021
An Observational Investigation of Reverse Engineers' Processes
Daniel Votipka
Seth M. Rabin
Kristopher K. Micinski
J. Foster
Michelle L. Mazurek
39
76
0
01 Dec 2019