MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation

17 August 2022
Federico Cassano
John Gouwar
Daniel Nguyen
S. Nguyen
Luna Phipps-Costin
Donald Pinckney
Ming-Ho Yee
Yangtian Zi
Carolyn Jane Anderson
Molly Q. Feldman
Arjun Guha
Michael Greenberg
Abhinav Jangda
    ELM

Papers citing "MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation"

50 / 70 papers shown
On-Policy Optimization with Group Equivalent Preference for Multi-Programming Language Understanding
Haoyuan Wu
Rui Ming
Jilong Gao
Hangyu Zhao
Xueyi Chen
Yikai Yang
Haisheng Zheng
Zhuolun He
Bei Yu
13
0
0
19 May 2025
LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs
Yunhui Xia
Wei Shen
Yan Wang
Jason Klein Liu
Huifeng Sun
Siyue Wu
Jian Hu
Xiaolong Xu
AI4TS
30
1
0
20 Apr 2025
Iterative Self-Training for Code Generation via Reinforced Re-Ranking
Nikita Sorokin
I. Sedykh
Valentin Malykh
31
0
0
13 Apr 2025
RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing
Yiqing Xie
Alex Xie
Divyanshu Sheth
Pengfei Liu
Daniel Fried
Carolyn Rose
LRM
62
0
0
10 Mar 2025
Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol
Roham Koohestani
Philippe de Bekker
M. Izadi
VLM
45
0
0
07 Mar 2025
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs
Ling Team
B. Zeng
Chenyu Huang
Chao Zhang
Changxin Tian
...
Zhaoxin Huan
Zujie Wen
Zhenhang Sun
Zhuoxuan Du
Z. He
MoE
ALM
109
2
0
07 Mar 2025
Deep-Bench: Deep Learning Benchmark Dataset for Code Generation
Alireza Daghighfarsoodeh
Chung-Yu Wang
Hamed Taherkhani
Melika Sepidband
Mohammad Abdollahi
Hadi Hemmati
Hung Viet Pham
ALM
ELM
96
0
0
26 Feb 2025
How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark
Ruizhong Qiu
Weiliang Will Zeng
Hanghang Tong
James Ezick
Christopher Lott
88
16
0
20 Feb 2025
RefineCoder: Iterative Improving of Large Language Models via Adaptive Critique Refinement for Code Generation
C. Zhou
Xinyu Zhang
Dandan Song
Xiancai Chen
Wanli Gu
Huipeng Ma
Yuhang Tian
Hao Fei
Linmei Hu
63
1
0
13 Feb 2025
LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks
Xin Zhou
Martin Weyssow
Ratnadira Widyasari
Ting Zhang
Junda He
Yunbo Lyu
Jianming Chang
Beiqi Zhang
Dan Huang
David Lo
PILM
297
1
0
10 Feb 2025
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
...
Zhilin Yang
Zhiqi Huang
Zihao Huang
Ziyao Xu
Zhengyuan Yang
VLM
ALM
OffRL
AI4TS
LRM
117
150
0
22 Jan 2025
How Should We Build A Benchmark? Revisiting 274 Code-Related Benchmarks For LLMs
Jialun Cao
Yuk-Kit Chan
Zixuan Ling
Wenxuan Wang
Shuqing Li
...
Pinjia He
Shuai Wang
Zibin Zheng
Michael R. Lyu
Shing-Chi Cheung
ALM
71
1
0
18 Jan 2025
Code LLMs: A Taxonomy-based Survey
Nishat Raihan
Christian D. Newman
Marcos Zampieri
97
1
0
11 Dec 2024
A Preliminary Study of Multilingual Code Language Models for Code Generation Task Using Translated Benchmarks
Rohit Dandamudi
Gema Rodríguez-Pérez
ELM
79
0
0
23 Nov 2024
M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation
Jiaheng Liu
Ken Deng
Congnan Liu
Jian Yang
Shukai Liu
...
Zekun Wang
Guoan Zhang
Bangyu Xiang
Wenbo Su
Jian Xu
75
4
0
28 Oct 2024
Bridge-Coder: Unlocking LLMs' Potential to Overcome Language Gaps in Low-Resource Code
Jipeng Zhang
Jianshu Zhang
Yuanzhe Li
Renjie Pi
Rui Pan
Runtao Liu
Ziqiang Zheng
Tong Zhang
36
0
0
24 Oct 2024
Do Current Language Models Support Code Intelligence for R Programming Language?
ZiXiao Zhao
Fatemeh H. Fard
ELM
47
0
0
10 Oct 2024
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He
Hang Yu
Zi Gong
Shizhan Liu
J. Li
Weiyao Lin
VLM
38
1
0
09 Oct 2024
Rule-based Data Selection for Large Language Models
Xiaomin Li
Mingye Gao
Zhiwei Zhang
Chang Yue
Hong Hu
42
5
0
07 Oct 2024
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
John Yang
Carlos E. Jimenez
Alex Zhang
K. Lieret
Joyce Yang
...
Gabriel Synnaeve
Karthik Narasimhan
Diyi Yang
Sida I. Wang
Ofir Press
41
23
0
04 Oct 2024
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models
Shaoxiong Ji
Zihao Li
Indraneil Paul
Jaakko Paavola
Peiqin Lin
...
Dayyán O'Brien
Hengyu Luo
Hinrich Schütze
Jörg Tiedemann
Barry Haddow
CLL
43
3
0
26 Sep 2024
CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair
Mingjie Liu
Yun-Da Tsai
Wenfei Zhou
Haoxing Ren
SyDa
3DV
45
6
0
19 Sep 2024
Qwen2.5-Coder Technical Report
Binyuan Hui
Jian Yang
Zeyu Cui
Jiaxi Yang
Dayiheng Liu
...
Fei Huang
Xingzhang Ren
Xuancheng Ren
Jingren Zhou
Junyang Lin
OSLM
75
212
0
18 Sep 2024
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
Zachary S. Siegel
Sayash Kapoor
Nitya Nagdir
Benedikt Stroebl
Arvind Narayanan
34
9
0
17 Sep 2024
SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories
Ben Bogin
Kejuan Yang
Shashank Gupta
Kyle Richardson
Erin Bransom
Peter Clark
Ashish Sabharwal
Tushar Khot
ELM
LRM
47
10
0
11 Sep 2024
HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data
Hossein Hajipour
Lea Schönherr
Thorsten Holz
Mario Fritz
AAML
SyDa
26
0
0
10 Sep 2024
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale
Huy N. Phan
Phong X. Nguyen
Nghi D. Q. Bui
LLMAG
33
12
0
09 Sep 2024
Multi-Programming Language Ensemble for Code Generation in Large Language Model
Tengfei Xue
Xuefeng Li
Tahir Azim
Roman Smirnov
Jianhui Yu
Arash Sadrieh
Babak Pahlavan
21
2
0
06 Sep 2024
CodeJudge-Eval: Can Large Language Models be Good Judges in Code Understanding?
Yuwei Zhao
Ziyang Luo
Yuchen Tian
Hongzhan Lin
Weixiang Yan
Annan Li
Jing Ma
ELM
ALM
LRM
50
8
0
20 Aug 2024
Practical Attacks against Black-box Code Completion Engines
Slobodan Jenko
Jingxuan He
Niels Mündler
Mark Vero
Martin Vechev
ELM
AAML
SILM
32
3
0
05 Aug 2024
ArchCode: Incorporating Software Requirements in Code Generation with Large Language Models
Hojae Han
Jaejin Kim
Jaeseok Yoo
Youngwon Lee
Seung-won Hwang
26
0
0
02 Aug 2024
On Leakage of Code Generation Evaluation Datasets
Alexandre Matton
Tom Sherborne
Dennis Aumiller
Elena Tommasone
Milad Alizadeh
Jingyi He
Raymond Ma
Maxime Voisin
Ellen Gilsenan-McMahon
Matthias Gallé
40
16
0
10 Jul 2024
Narrow Transformer: Starcoder-Based Java-LM For Desktop
Kamalkumar Rathinasamy
Balaji A J
Ankush Kumar
Gagan Gayari
Harshini K
Rajab Ali Mondal
S. SreenivasaRaghavanK
Swayam Singh
43
1
0
04 Jul 2024
ConCodeEval: Evaluating Large Language Models for Code Constraints in Domain-Specific Languages
Mehant Kammakomati
Sameer Pimparkhede
Srikanth G. Tamilselvam
Prince Kumar
Pushpak Bhattacharyya
ALM
40
0
0
03 Jul 2024
UniCoder: Scaling Code Large Language Model via Universal Code
Tao Sun
Linzheng Chai
Jian Yang
Yuwei Yin
Hongcheng Guo
Jiaheng Liu
Bing Wang
Liqun Yang
Zhoujun Li
OffRL
LRM
68
16
0
24 Jun 2024
Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models
Sanjay Vishwakarma
Francis Harkins
Siddharth Golecha
Vishal Sharathchandra Bajpe
Nicolas Dupuis
Luca Buratti
David Kremer
Ismael Faro
Ruchir Puri
Juan Cruz-Benito
ELM
50
3
0
20 Jun 2024
Benchmarks and Metrics for Evaluations of Code Generation: A Critical Review
Debalina Ghosh Paul
Hong Zhu
Ian Bayley
ALM
ELM
39
9
0
18 Jun 2024
ScenEval: A Benchmark for Scenario-Based Evaluation of Code Generation
Debalina Ghosh Paul
Hong Zhu
Ian Bayley
35
2
0
18 Jun 2024
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark
Chanjun Park
Hyeonwoo Kim
Dahyun Kim
Seonghwan Cho
Sanghoon Kim
Sukyung Lee
Yungi Kim
Hwalsuk Lee
ELM
ALM
43
14
0
31 May 2024
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct
Bin Lei
Yuchen Li
Qiuwu Chen
SyDa
ALM
ELM
36
6
0
23 May 2024
On the Limitations of Embedding Based Methods for Measuring Functional Correctness for Code Generation
Atharva Naik
46
2
0
26 Apr 2024
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
Yifeng Ding
Jiawei Liu
Yuxiang Wei
Terry Yue Zhuo
Lingming Zhang
ALM
MoE
44
3
0
23 Apr 2024
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models
Jiawei Guo
Ziming Li
Xueling Liu
Kaijing Ma
Tianyu Zheng
...
Xingwei Qu
Xiang Yue
Ge Zhang
Wenhu Chen
Jie Fu
KELM
59
12
0
04 Apr 2024
SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens
Chengbo Liu
Yong Zhu
23
0
0
27 Mar 2024
Exploring Language Model's Code Generation Ability with Auxiliary Functions
Seonghyeon Lee
Sanghwan Jang
Seongbo Jang
Dongha Lee
Hwanjo Yu
ALM
37
2
0
15 Mar 2024
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Naman Jain
King Han
Alex Gu
Wen-Ding Li
Fanjia Yan
Tianjun Zhang
Sida I. Wang
Armando Solar-Lezama
Koushik Sen
Ion Stoica
ELM
36
280
0
12 Mar 2024
Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
Linyuan Gong
Sida Wang
Mostafa Elhoushi
Alvin Cheung
32
15
0
07 Mar 2024
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators
Indraneil Paul
Goran Glavas
Iryna Gurevych
42
13
0
06 Mar 2024
CodeMind: A Framework to Challenge Large Language Models for Code Reasoning
Changshu Liu
Shizhuo Dylan Zhang
Ali Reza Ibrahimzada
Reyhaneh Jabbarvand
ELM
ReCod
LRM
33
0
0
15 Feb 2024
EffiBench: Benchmarking the Efficiency of Automatically Generated Code
Dong Huang
Yuhao Qing
Weiyi Shang
Heming Cui
Jie M. Zhang
85
31
0
03 Feb 2024