MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation

17 August 2022
Federico Cassano, John Gouwar, Daniel Nguyen, Sydney Nguyen, Luna Phipps-Costin, Donald Pinckney, Ming-Ho Yee, Yangtian Zi, Carolyn Jane Anderson, Molly Q. Feldman, Arjun Guha, Michael Greenberg, Abhinav Jangda

Papers citing "MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation"

25 papers shown

SwiftEval: Developing a Language-Specific Benchmark for LLM-generated Code Evaluation
Ivan Petrukha, Yana Kurliak, Nataliia Stulova
30 May 2025

GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
Manish Shetty, Naman Jain, Jinjian Liu, Vijay Kethanaboyina, Koushik Sen, Ion Stoica
29 May 2025

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought
Tencent Hunyuan Team, Ao Liu, Botong Zhou, Can Xu, Chayse Zhou, ..., Bingxin Qu, Bolin Ni, Boyu Wu, Chen Li, Cheng-peng Jiang
21 May 2025

On-Policy Optimization with Group Equivalent Preference for Multi-Programming Language Understanding
Haoyuan Wu, Rui Ming, Jilong Gao, Hangyu Zhao, Xueyi Chen, Yikai Yang, Haisheng Zheng, Zhuolun He, Bei Yu
19 May 2025

Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol
Roham Koohestani, Philippe de Bekker, Maliheh Izadi
07 Mar 2025

Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs
Ling Team, B. Zeng, Chenyu Huang, Chao Zhang, Changxin Tian, ..., Zhaoxin Huan, Zujie Wen, Zhenhang Sun, Zhuoxuan Du, Z. He
07 Mar 2025

Deep-Bench: Deep Learning Benchmark Dataset for Code Generation
Alireza Daghighfarsoodeh, Chung-Yu Wang, Hamed Taherkhani, Melika Sepidband, Mohammad Abdollahi, Hadi Hemmati, Hung Viet Pham
26 Feb 2025

How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark
Ruizhong Qiu, Weiliang Will Zeng, Hanghang Tong, James Ezick, Christopher Lott
20 Feb 2025

RefineCoder: Iterative Improving of Large Language Models via Adaptive Critique Refinement for Code Generation
C. Zhou, Xinyu Zhang, Dandan Song, Xiancai Chen, Wanli Gu, Huipeng Ma, Yuhang Tian, Hao Fei, Linmei Hu
13 Feb 2025

Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team, Angang Du, Bofei Gao, Bowei Xing, Changjiu Jiang, ..., Zihao Huang, Ziyao Xu, Zhiyong Yang, Zonghan Yang, Zongyu Lin
22 Jan 2025

Code LLMs: A Taxonomy-based Survey
Nishat Raihan, Christian D. Newman, Marcos Zampieri
11 Dec 2024

Do Current Language Models Support Code Intelligence for R Programming Language?
ZiXiao Zhao, Fatemeh H. Fard
10 Oct 2024

Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He, Hang Yu, Zi Gong, Shizhan Liu, Jia-Nan Li, Weiyao Lin
09 Oct 2024

EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models
Shaoxiong Ji, Zihao Li, Indraneil Paul, Jaakko Paavola, Peiqin Lin, ..., Dayyán O'Brien, Hengyu Luo, Hinrich Schütze, Jörg Tiedemann, Barry Haddow
26 Sep 2024

CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair
Mingjie Liu, Yun-Da Tsai, Wenfei Zhou, Haoxing Ren
19 Sep 2024

Black-Box Adversarial Attacks on LLM-Based Code Completion
Slobodan Jenko, Jingxuan He, Niels Mündler, Mark Vero, Martin Vechev
05 Aug 2024

ConCodeEval: Evaluating Large Language Models for Code Constraints in Domain-Specific Languages
Mehant Kammakomati, Sameer Pimparkhede, Srikanth G. Tamilselvam, Praveen Venkateswaran, Pushpak Bhattacharyya
03 Jul 2024

Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models
Sanjay Vishwakarma, Francis Harkins, Siddharth Golecha, Vishal Sharathchandra Bajpe, Nicolas Dupuis, Luca Buratti, David Kremer, Ismael Faro, Ruchir Puri, Juan Cruz-Benito
20 Jun 2024

AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct
Bin Lei, Tianyu Shi, Qiuwu Chen
23 May 2024

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models
Jiawei Guo, Ziming Li, Xueling Liu, Kaijing Ma, Tianyu Zheng, ..., Xingwei Qu, Xiang Yue, Ge Zhang, Wenhu Chen, Jie Fu
04 Apr 2024

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida I. Wang, Armando Solar-Lezama, Koushik Sen, Ion Stoica
12 Mar 2024

JumpCoder: Go Beyond Autoregressive Coder via Online Modification
Mouxiang Chen, Hao Tian, Zhongxi Liu, Xiaoxue Ren, Jianling Sun
15 Jan 2024

CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation
Weixiang Yan, Haitian Liu, Yunkun Wang, Yunzhe Li, Qian Chen, ..., Tingyu Lin, Weishan Zhao, Li Zhu, Hari Sundaram, Shuiguang Deng
14 Nov 2023

Benchmarking Causal Study to Interpret Large Language Models for Source Code
Daniel Rodríguez-Cárdenas, David Nader-Palacio, Dipin Khati, Henry Burke, Denys Poshyvanyk
23 Aug 2023

xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval
Mohammad Abdullah Matin Khan, M Saiful Bari, Xuan Long Do, Weishi Wang, Md. Rizwan Parvez, Shafiq Joty
06 Mar 2023