LLM vs. SAST: A Technical Analysis on Detecting Coding Bugs of GPT4-Advanced Data Analysis

18 June 2025

Madjid G. Tehrani

Eldar Sultanow

William J. Buchanan

Mahkame Houmani

Christel H. Djaha Fodja

Main: 10 pages

Bibliography: 6 pages

Tables: 3

Appendix: 1 page

Abstract

With rapid advances in Natural Language Processing (NLP), large language models (LLMs) such as GPT-4 have gained significant traction in diverse applications, including security vulnerability scanning. This paper investigates the efficacy of GPT-4 in identifying software vulnerabilities compared with traditional Static Application Security Testing (SAST) tools. Drawing on a range of security mistakes, our analysis underscores the capabilities of GPT-4 in LLM-enhanced vulnerability scanning. We found that GPT-4 (Advanced Data Analysis) outperforms SAST, with an accuracy of 94% in detecting 32 types of exploitable vulnerabilities. This study also addresses potential security concerns surrounding LLMs, emphasising the need for security by design/default and other security best practices for AI.

@article{tehrani2025_2506.15212,
  title={LLM vs. SAST: A Technical Analysis on Detecting Coding Bugs of GPT4-Advanced Data Analysis},
  author={Madjid G. Tehrani and Eldar Sultanow and William J. Buchanan and Mahkame Houmani and Christel H. Djaha Fodja},
  journal={arXiv preprint arXiv:2506.15212},
  year={2025}
}
