2210.14868
Multi-lingual Evaluation of Code Generation Models
26 October 2022
Ben Athiwaratkun
Sanjay Krishna Gouda
Zijian Wang
Xiaopeng Li
Yuchen Tian
Ming Tan
Wasi Uddin Ahmad
Shiqi Wang
Qing Sun
Mingyue Shang
Sujan Kumar Gonugondla
Hantian Ding
Varun Kumar
Nathan Fulton
A. Farahani
Siddharth Jain
Robert Giaquinto
Haifeng Qian
M. K. Ramanathan
Ramesh Nallapati
Baishakhi Ray
Parminder Bhatia
Sudipta Sengupta
Dan Roth
Bing Xiang
ELM
Papers citing
"Multi-lingual Evaluation of Code Generation Models"
28 / 28 papers shown
Title
QiMeng-TensorOp: Automatically Generating High-Performance Tensor Operators with Hardware Primitives
X. Zhang
Shaohui Peng
Qirui Zhou
Yuanbo Wen
Qi Guo
...
Ke Gao
Chen Zhao
Yanjun Wu
Yunji Chen
Ling Li
VLM
39
0
0
08 May 2025
OSVBench: Benchmarking LLMs on Specification Generation Tasks for Operating System Verification
Shangyu Li
Juyong Jiang
Tiancheng Zhao
Jiasi Shen
49
0
0
29 Apr 2025
ResearchCodeAgent: An LLM Multi-Agent System for Automated Codification of Research Methodologies
Shubham Gandhi
Dhruv Shah
Manasi S. Patwardhan
L. Vig
Gautam M. Shroff
LLMAG
AI4CE
143
0
0
28 Apr 2025
A Large-scale Class-level Benchmark Dataset for Code Generation with LLMs
Musfiqur Rahman
SayedHassan Khatoonabadi
Emad Shihab
ALM
39
0
0
22 Apr 2025
Can LLMs Reason About Program Semantics? A Comprehensive Evaluation of LLMs on Formal Specification Inference
Thanh Le-Cong
Bach Le
Toby Murray
LRM
47
1
0
22 Feb 2025
SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors
Bohan Lyu
Siqiao Huang
Zichen Liang
Qi-An Sun
Jiaming Zhang
ELM
LRM
60
0
0
16 Feb 2025
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation
Zhaojian Yu
Yilun Zhao
Arman Cohan
Xiao-Ping Zhang
LRM
36
2
0
03 Jan 2025
A Deep Dive Into Large Language Model Code Generation Mistakes: What and Why?
QiHong Chen
Jiawei Li
Jiecheng Deng
Jiachen Yu
Justin Tian Jin Chen
Iftekhar Ahmed
56
0
0
03 Nov 2024
Kotlin ML Pack: Technical Report
Sergey Titov
Mikhail Evtikhiev
Anton Shapkin
Oleg Smirnov
Sergei Boytsov
...
Dariia Karaeva
Maksim Sheptyakov
Mikhail Arkhipov
T. Bryksin
Egor Bogomolov
32
0
0
29 May 2024
Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance
Yewei Song
Cedric Lothritz
Daniel Tang
Tegawende F. Bissyande
Jacques Klein
46
9
0
12 Apr 2024
Inherent limitations of LLMs regarding spatial information
He Yan
Xinyao Hu
Xiangpeng Wan
Chengyu Huang
Kai Zou
Shiqi Xu
LRM
28
2
0
05 Dec 2023
Can LLMs Patch Security Issues?
Kamel Alrashedy
Abdullah Aljasser
Pradyumna Tambwekar
Matthew Gombolay
AAML
21
6
0
13 Nov 2023
Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations
Zilu Tang
Mayank Agarwal
Alex Shypula
Bailin Wang
Derry Wijaya
Jie Chen
Yoon Kim
LRM
37
15
0
13 Nov 2023
Bias Testing and Mitigation in LLM-based Code Generation
Dong Huang
Qingwen Bu
Jie M. Zhang
Xiaofei Xie
Junjie Chen
Heming Cui
43
20
0
03 Sep 2023
A Lightweight Framework for High-Quality Code Generation
Mohammed Latif Siddiq
B.K. Casey
Joanna C. S. Santos
41
17
0
17 Jul 2023
Coarse-Tuning Models of Code with Reinforcement Learning Feedback
Abhinav C. P. Jain
Chima Adiole
Swarat Chaudhuri
Thomas W. Reps
Chris Jermaine
ALM
19
2
0
25 May 2023
CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
Shuyan Zhou
Uri Alon
Sumit Agarwal
Graham Neubig
ELM
ALM
31
98
0
10 Feb 2023
MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation
Federico Cassano
John Gouwar
Daniel Nguyen
S. Nguyen
Luna Phipps-Costin
...
Carolyn Jane Anderson
Molly Q. Feldman
Arjun Guha
Michael Greenberg
Abhinav Jangda
ELM
24
81
0
17 Aug 2022
Grounded Copilot: How Programmers Interact with Code-Generating Models
Shraddha Barke
M. James
Nadia Polikarpova
164
212
0
30 Jun 2022
A Systematic Evaluation of Large Language Models of Code
Frank F. Xu
Uri Alon
Graham Neubig
Vincent J. Hellendoorn
ELM
ALM
204
631
0
26 Feb 2022
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole
Varun Gangal
Sebastian Gehrmann
Aadesh Gupta
Zhenhao Li
...
Tianbao Xie
Usama Yaseen
Michael A. Yee
Jing Zhang
Yue Zhang
174
86
0
06 Dec 2021
P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
Xiao Liu
Kaixuan Ji
Yicheng Fu
Weng Lam Tam
Zhengxiao Du
Zhilin Yang
Jie Tang
VLM
238
806
0
14 Oct 2021
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Shafiq R. Joty
S. Hoi
235
1,489
0
02 Sep 2021
AVATAR: A Parallel Corpus for Java-Python Program Translation
W. Ahmad
Md Golam Rahman Tushar
Saikat Chakraborty
Kai-Wei Chang
35
78
0
26 Aug 2021
Measuring Coding Challenge Competence With APPS
Dan Hendrycks
Steven Basart
Saurav Kadavath
Mantas Mazeika
Akul Arora
...
Collin Burns
Samir Puranik
Horace He
D. Song
Jacob Steinhardt
ELM
AIMat
ALM
208
624
0
20 May 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,848
0
18 Apr 2021
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu
Daya Guo
Shuo Ren
Junjie Huang
Alexey Svyatkovskiy
...
Nan Duan
Neel Sundaresan
Shao Kun Deng
Shengyu Fu
Shujie Liu
ELM
198
1,105
0
09 Feb 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,821
0
17 Sep 2019