Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation

2 May 2023
Jiawei Liu
Chun Xia
Yuyao Wang
Lingming Zhang
    ELM
    ALM

Papers citing "Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation"

38 / 138 papers shown
We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs
Joseph Spracklen
Raveen Wijewickrama
A. H. M. N. Sakib
Anindya Maiti
Murtuza Jadliwala
45
10
0
12 Jun 2024
Validating LLM-Generated Programs with Metamorphic Prompt Testing
Xiaoyin Wang
Dakai Zhu
38
3
0
11 Jun 2024
Transforming Wearable Data into Health Insights using Large Language Model Agents
Mike A. Merrill
Akshay Paruchuri
Naghmeh Rezaei
Geza Kovacs
Javier Perez
...
Shwetak Patel
Jiening Zhan
Tim Althoff
Daniel J. McDuff
Xin Liu
LM&MA
LLMAG
AI4CE
54
8
0
10 Jun 2024
A Survey Study on the State of the Art of Programming Exercise Generation using Large Language Models
Eduard Frankford
Ingo Höhn
Clemens Sauerwein
Ruth Breu
ELM
39
2
0
30 May 2024
Robo-Instruct: Simulator-Augmented Instruction Alignment For Finetuning Code LLMs
Zichao Hu
Junyi Jessy Li
Arjun Guha
Joydeep Biswas
SyDa
ALM
51
1
0
30 May 2024
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning
Runqian Wang
Soumya Ghosh
David D. Cox
Diego Antognini
Aude Oliva
Rogerio Feris
Leonid Karlinsky
37
1
0
27 May 2024
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation
Houxing Ren
Mingjie Zhan
Zhongyuan Wu
Aojun Zhou
Junting Pan
Hongsheng Li
SyDa
42
7
0
27 May 2024
Granite Code Models: A Family of Open Foundation Models for Code Intelligence
Mayank Mishra
Matt Stallone
Gaoyuan Zhang
Yikang Shen
Aditya Prasad
...
Amith Singhee
Nirmit Desai
David D. Cox
Ruchir Puri
Rameswar Panda
AI4TS
56
55
0
07 May 2024
When LLMs Meet Cybersecurity: A Systematic Literature Review
Jie Zhang
Haoyu Bu
Hui Wen
Yu Chen
Lun Li
Hongsong Zhu
42
36
0
06 May 2024
A Survey on Self-Evolution of Large Language Models
Zhengwei Tao
Ting-En Lin
Xiancai Chen
Hangyu Li
Yuchuan Wu
Yongbin Li
Zhi Jin
Fei Huang
Dacheng Tao
Jingren Zhou
LRM
LM&Ro
57
22
0
22 Apr 2024
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward
Xuan Xie
Jiayang Song
Zhehua Zhou
Yuheng Huang
Da Song
Lei Ma
OffRL
53
6
0
12 Apr 2024
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models
Jiawei Guo
Ziming Li
Xueling Liu
Kaijing Ma
Tianyu Zheng
...
Xingwei Qu
Xiang Yue
Ge Zhang
Wenhu Chen
Jie Fu
KELM
59
12
0
04 Apr 2024
CSEPrompts: A Benchmark of Introductory Computer Science Prompts
Md. Nishat Raihan
Dhiman Goswami
Sadiya Sayara Chowdhury Puspo
Christian D. Newman
Tharindu Ranasinghe
Marcos Zampieri
ELM
44
2
0
03 Apr 2024
Large Language Models for Blockchain Security: A Systematic Literature Review
Zheyuan He
Zihao Li
Sen Yang
Ao Qiao
Xiaosong Zhang
Xiapu Luo
Ting Chen
PILM
42
14
0
21 Mar 2024
FlowMind: Automatic Workflow Generation with LLMs
Zhen Zeng
William Watson
Nicole Cho
Saba Rahimi
Shayleen Reynolds
T. Balch
Manuela Veloso
39
26
0
17 Mar 2024
Bugs in Large Language Models Generated Code: An Empirical Study
Florian Tambon
Arghavan Moradi Dakhel
Amin Nikanjam
Foutse Khomh
Michel C. Desmarais
G. Antoniol
ELM
39
33
0
13 Mar 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
F. Breitinger
Mark Scanlon
52
8
0
29 Feb 2024
API Pack: A Massive Multi-Programming Language Dataset for API Call Generation
Zhen Guo
Adriana Meza Soria
Wei Sun
Yikang Shen
Rameswar Panda
ELM
ALM
55
1
0
14 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM
ELM
PILM
21
156
0
06 Feb 2024
Learning Agent-based Modeling with LLM Companions: Experiences of Novices and Experts Using ChatGPT & NetLogo Chat
John Chen
Xi Lu
Michael Rejtig
Yuzhou Du
Ruth Bagley
Mike Horn
Uri Wilensky
28
29
0
30 Jan 2024
(Chat)GPT v BERT: Dawn of Justice for Semantic Change Detection
Francesco Periti
Haim Dubossarsky
Nina Tahmasebi
AI4MH
34
13
0
25 Jan 2024
LangProp: A code optimization framework using Large Language Models applied to driving
Shu Ishida
Gianluca Corrado
George Fedoseev
Hudson Yeo
Lloyd Russell
Jamie Shotton
João F. Henriques
Anthony Hu
59
11
0
18 Jan 2024
Xpert: Empowering Incident Management with Query Recommendations via Large Language Models
Yuxuan Jiang
Chaoyun Zhang
Shilin He
Zhihao Yang
Ming-Jie Ma
...
Yu Kang
Yingnong Dang
Saravan Rajmohan
Qingwei Lin
Dongmei Zhang
45
17
0
19 Dec 2023
Transfer Attacks and Defenses for Large Language Models on Coding Tasks
Chi Zhang
Zifan Wang
Ravi Mangal
Matt Fredrikson
Limin Jia
Corina S. Pasareanu
AAML
SILM
27
1
0
22 Nov 2023
AI-native Interconnect Framework for Integration of Large Language Model Technologies in 6G Systems
Sasu Tarkoma
Roberto Morabito
Jaakko Sauvola
23
19
0
10 Nov 2023
Forgetful Large Language Models: Lessons Learned from Using LLMs in Robot Programming
Juo-Tung Chen
Chien-Ming Huang
LLMAG
19
12
0
10 Oct 2023
Demystifying RCE Vulnerabilities in LLM-Integrated Apps
Tong Liu
Zizhuang Deng
Guozhu Meng
Yuekang Li
Kai Chen
SILM
44
19
0
06 Sep 2023
Bias Testing and Mitigation in LLM-based Code Generation
Dong Huang
Qingwen Bu
Jie M. Zhang
Xiaofei Xie
Junjie Chen
Heming Cui
45
20
0
03 Sep 2023
Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy
Yu Fu
Deyi Xiong
Yue Dong
WaLM
47
30
0
25 Jul 2023
A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification
Norbert Tihanyi
Ridhi Jain
Yiannis Charalambous
M. Ferrag
Youcheng Sun
Lucas C. Cordeiro
26
48
0
24 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
39
82
0
19 May 2023
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages
Erik Nijkamp
A. Ghobadzadeh
Caiming Xiong
Silvio Savarese
Yingbo Zhou
152
164
0
03 May 2023
CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models
Hossein Hajipour
Keno Hassler
Thorsten Holz
Lea Schonherr
Mario Fritz
ELM
40
20
0
08 Feb 2023
Few-shot training LLMs for project-specific code-summarization
Toufique Ahmed
Prem Devanbu
179
213
0
09 Jul 2022
A Systematic Evaluation of Large Language Models of Code
Frank F. Xu
Uri Alon
Graham Neubig
Vincent J. Hellendoorn
ELM
ALM
204
631
0
26 Feb 2022
Learning to Superoptimize Real-world Programs
Alex Shypula
Pengcheng Yin
Jeremy Lacomis
Claire Le Goues
Edward N. Schwartz
Graham Neubig
NAI
107
10
0
28 Sep 2021
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Shafiq R. Joty
S. Hoi
238
1,489
0
02 Sep 2021
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu
Daya Guo
Shuo Ren
Junjie Huang
Alexey Svyatkovskiy
...
Nan Duan
Neel Sundaresan
Shao Kun Deng
Shengyu Fu
Shujie Liu
ELM
201
853
0
09 Feb 2021