Title | Authors | Tags | Counts | Date
QiMeng-TensorOp: Automatically Generating High-Performance Tensor Operators with Hardware Primitives | X. Zhang, Shaohui Peng, Qirui Zhou, Yuanbo Wen, Qi Guo, ..., Ke Gao, Chen Zhao, Yanjun Wu, Yunji Chen, Ling Li | VLM | 39 0 0 | 08 May 2025
OSVBench: Benchmarking LLMs on Specification Generation Tasks for Operating System Verification | Shangyu Li, Juyong Jiang, Tiancheng Zhao, Jiasi Shen | | 49 0 0 | 29 Apr 2025
ResearchCodeAgent: An LLM Multi-Agent System for Automated Codification of Research Methodologies | Shubham Gandhi, Dhruv Shah, Manasi S. Patwardhan, L. Vig, Gautam M. Shroff | LLMAG AI4CE | 143 0 0 | 28 Apr 2025
A Large-scale Class-level Benchmark Dataset for Code Generation with LLMs | Musfiqur Rahman, SayedHassan Khatoonabadi, Emad Shihab | ALM | 39 0 0 | 22 Apr 2025
Can LLMs Reason About Program Semantics? A Comprehensive Evaluation of LLMs on Formal Specification Inference | Thanh Le-Cong, Bach Le, Toby Murray | LRM | 47 1 0 | 22 Feb 2025
SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors | Bohan Lyu, Siqiao Huang, Zichen Liang, Qi-An Sun, Jiaming Zhang | ELM LRM | 60 0 0 | 16 Feb 2025
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation | Zhaojian Yu, Yilun Zhao, Arman Cohan, Xiao-Ping Zhang | LRM | 36 2 0 | 03 Jan 2025
A Deep Dive Into Large Language Model Code Generation Mistakes: What and Why? | QiHong Chen, Jiawei Li, Jiecheng Deng, Jiachen Yu, Justin Tian Jin Chen, Iftekhar Ahmed | | 56 0 0 | 03 Nov 2024
Kotlin ML Pack: Technical Report | Sergey Titov, Mikhail Evtikhiev, Anton Shapkin, Oleg Smirnov, Sergei Boytsov, ..., Dariia Karaeva, Maksim Sheptyakov, Mikhail Arkhipov, T. Bryksin, Egor Bogomolov | | 32 0 0 | 29 May 2024
Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance | Yewei Song, Cedric Lothritz, Daniel Tang, Tegawende F. Bissyande, Jacques Klein | | 46 9 0 | 12 Apr 2024
Inherent limitations of LLMs regarding spatial information | He Yan, Xinyao Hu, Xiangpeng Wan, Chengyu Huang, Kai Zou, Shiqi Xu | LRM | 28 2 0 | 05 Dec 2023
Can LLMs Patch Security Issues? | Kamel Alrashedy, Abdullah Aljasser, Pradyumna Tambwekar, Matthew Gombolay | AAML | 21 6 0 | 13 Nov 2023
Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations | Zilu Tang, Mayank Agarwal, Alex Shypula, Bailin Wang, Derry Wijaya, Jie Chen, Yoon Kim | LRM | 37 15 0 | 13 Nov 2023
Bias Testing and Mitigation in LLM-based Code Generation | Dong Huang, Qingwen Bu, Jie M. Zhang, Xiaofei Xie, Junjie Chen, Heming Cui | | 43 20 0 | 03 Sep 2023
A Lightweight Framework for High-Quality Code Generation | Mohammed Latif Siddiq, B.K. Casey, Joanna C. S. Santos | | 41 17 0 | 17 Jul 2023
Coarse-Tuning Models of Code with Reinforcement Learning Feedback | Abhinav C. P. Jain, Chima Adiole, Swarat Chaudhuri, Thomas W. Reps, Chris Jermaine (Rice University) | ALM | 19 2 0 | 25 May 2023
CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code | Shuyan Zhou, Uri Alon, Sumit Agarwal, Graham Neubig | ELM ALM | 31 98 0 | 10 Feb 2023
MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation | Federico Cassano, John Gouwar, Daniel Nguyen, S. Nguyen, Luna Phipps-Costin, ..., Carolyn Jane Anderson, Molly Q. Feldman, Arjun Guha, Michael Greenberg, Abhinav Jangda | ELM | 24 81 0 | 17 Aug 2022
Grounded Copilot: How Programmers Interact with Code-Generating Models | Shraddha Barke, M. James, Nadia Polikarpova | | 164 212 0 | 30 Jun 2022
A Systematic Evaluation of Large Language Models of Code | Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn | ELM ALM | 204 631 0 | 26 Feb 2022
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation | Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, ..., Tianbao Xie, Usama Yaseen, Michael A. Yee, Jing Zhang, Yue Zhang | | 174 86 0 | 06 Dec 2021
P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks | Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang | VLM | 238 806 0 | 14 Oct 2021
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation | Yue Wang, Weishi Wang, Shafiq R. Joty, S. Hoi | | 235 1,489 0 | 02 Sep 2021
AVATAR: A Parallel Corpus for Java-Python Program Translation | W. Ahmad, Md Golam Rahman Tushar, Saikat Chakraborty, Kai-Wei Chang | | 35 78 0 | 26 Aug 2021
Measuring Coding Challenge Competence With APPS | Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, ..., Collin Burns, Samir Puranik, Horace He, D. Song, Jacob Steinhardt | ELM AIMat ALM | 208 624 0 | 20 May 2021
The Power of Scale for Parameter-Efficient Prompt Tuning | Brian Lester, Rami Al-Rfou, Noah Constant | VPVLM | 280 3,848 0 | 18 Apr 2021
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation | Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, ..., Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu | ELM | 198 1,105 0 | 09 Feb 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism | M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro | MoE | 245 1,821 0 | 17 Sep 2019