2305.06161
StarCoder: may the source be with you!
9 May 2023
Raymond Li
Loubna Ben Allal
Yangtian Zi
Niklas Muennighoff
Denis Kocetkov
Chenghao Mou
Marc Marone
Christopher Akiki
Jia Li
Jenny Chim
Qian Liu
Evgenii Zheltonozhskii
Terry Yue Zhuo
Thomas Wang
Olivier Dehaene
Mishig Davaadorj
J. Lamy-Poirier
João Monteiro
Oleh Shliazhko
Nicolas Angelard-Gontier
Nicholas Meade
A. Zebaze
Ming-Ho Yee
Logesh Kumar Umapathi
Jian Zhu
Benjamin Lipkin
Muhtasham Oblokulov
Zhiruo Wang
Rudra Murthy
Jason T Stillerman
S. Patel
Dmitry Abulkhanov
Marco Zocca
Manan Dey
Zhihan Zhang
N. Fahmy
Urvashi Bhattacharyya
W. Yu
Swayam Singh
Sasha Luccioni
Paulo Villegas
M. Kunakov
Fedor Zhdanov
Manuel Romero
Tony Lee
Nadav Timor
Jennifer Ding
Claire Schlesinger
Hailey Schoelkopf
Jana Ebert
Tri Dao
Mayank Mishra
A. Gu
Jennifer Robinson
Carolyn Jane Anderson
Brendan Dolan-Gavitt
Danish Contractor
Siva Reddy
Daniel Fried
Dzmitry Bahdanau
Yacine Jernite
Carlos Muñoz Ferrandis
Sean M. Hughes
Thomas Wolf
Arjun Guha
Leandro von Werra
H. D. Vries
Papers citing "StarCoder: may the source be with you!"
50 / 119 papers shown
Title
A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations
Md Tahmid Rahman Laskar
Sawsan Alqahtani
M Saiful Bari
Mizanur Rahman
Mohammad Abdullah Matin Khan
...
Chee Wei Tan
Md. Rizwan Parvez
Enamul Hoque
Chenyu You
Jimmy Huang
ELM
ALM
31
28
0
04 Jul 2024
MPCODER: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning
Zhenlong Dai
Chang Yao
WenKang Han
Ying Yuan
Zhipeng Gao
Jingyuan Chen
26
11
0
25 Jun 2024
AnnotatedTables: A Large Tabular Dataset with Language Model Annotations
Yaojie Hu
Ilias Fountalis
Jin Tian
N. Vasiloglou
LMTD
36
4
0
24 Jun 2024
CodeRAG-Bench: Can Retrieval Augment Code Generation?
Zora Zhiruo Wang
Akari Asai
Xinyan Velocity Yu
Frank F. Xu
Yiqing Xie
Graham Neubig
Daniel Fried
RALM
80
30
0
20 Jun 2024
Prose-to-P4: Leveraging High Level Languages
Mihai-Valentin Dumitru
Vlad-Andrei Bădoiu
C. Raiciu
34
0
0
19 Jun 2024
Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency
Leonidas Gee
Milan Gritta
Gerasimos Lampouras
Ignacio Iacobacci
28
10
0
18 Jun 2024
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Hoyeon Chang
Jinho Park
Seonghyeon Ye
Sohee Yang
Youngkyung Seo
Du-Seong Chang
Minjoon Seo
KELM
37
33
0
17 Jun 2024
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL
Zijin Hong
Zheng Yuan
Qinggang Zhang
Hao Chen
Junnan Dong
Feiran Huang
Xiao Huang
77
51
0
12 Jun 2024
Leveraging Large Language Models for Efficient Failure Analysis in Game Development
Leonardo Marini
Linus Gisslén
Alessandro Sestini
54
0
0
11 Jun 2024
Kotlin ML Pack: Technical Report
Sergey Titov
Mikhail Evtikhiev
Anton Shapkin
Oleg Smirnov
Sergei Boytsov
...
Dariia Karaeva
Maksim Sheptyakov
Mikhail Arkhipov
T. Bryksin
Egor Bogomolov
32
0
0
29 May 2024
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Hai-Tao Zheng
Min Li
Wanxiang Che
Philip S. Yu
ALM
LM&MA
ELM
LRM
52
48
0
21 May 2024
LG AI Research & KAIST at EHRSQL 2024: Self-Training Large Language Models with Pseudo-Labeled Unanswerable Questions for a Reliable Text-to-SQL System on EHRs
Yongrae Jo
Seongyun Lee
Minju Seo
Sung Ju Hwang
Moontae Lee
42
3
0
18 May 2024
Performance-Aligned LLMs for Generating Fast Code
Daniel Nichols
Pranav Polasam
Harshitha Menon
Aniruddha Marathe
T. Gamblin
A. Bhatele
35
8
0
29 Apr 2024
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
Parshin Shojaee
Kazem Meidani
Shashank Gupta
A. Farimani
Chandan K. Reddy
42
15
0
29 Apr 2024
SVGEditBench: A Benchmark Dataset for Quantitative Assessment of LLM's SVG Editing Capabilities
Kunato Nishina
Yusuke Matsui
40
8
0
21 Apr 2024
JetMoE: Reaching Llama2 Performance with 0.1M Dollars
Yikang Shen
Zhen Guo
Tianle Cai
Zengyi Qin
MoE
ALM
46
28
0
11 Apr 2024
HDLdebugger: Streamlining HDL debugging with Large Language Models
Xufeng Yao
Haoyang Li
T. H. Chan
Wenyi Xiao
Mingxuan Yuan
Yu Huang
Lei Chen
Bei Yu
24
19
0
18 Mar 2024
Semi-Instruct: Bridging Natural-Instruct and Self-Instruct for Code Large Language Models
Xianzhen Luo
Qingfu Zhu
Zhiming Zhang
Xu Wang
Qing Yang
Dongliang Xu
Wanxiang Che
ALM
32
2
0
01 Mar 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
F. Breitinger
Mark Scanlon
52
8
0
29 Feb 2024
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
Alex Zhuang
Ge Zhang
Tianyu Zheng
Xinrun Du
Junjie Wang
Weiming Ren
Stephen W. Huang
Jie Fu
Xiang Yue
Wenhu Chen
LMTD
49
14
0
26 Feb 2024
Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step
Li Zhong
Zilong Wang
Jingbo Shang
29
48
0
25 Feb 2024
CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement Learning for LLM-based Mutation
Jueon Eom
Seyeon Jeong
Taekyoung Kwon
32
7
0
19 Feb 2024
API Pack: A Massive Multi-Programming Language Dataset for API Call Generation
Zhen Guo
Adriana Meza Soria
Wei Sun
Songlin Yang
Yikang Shen
ELM
ALM
55
1
0
14 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
134
371
0
09 Feb 2024
Text-to-Code Generation with Modality-relative Pre-training
Fenia Christopoulou
Guchun Zhang
Gerasimos Lampouras
AI4TS
23
1
0
08 Feb 2024
On the Standardization of Behavioral Use Clauses and Their Adoption for Responsible Licensing of AI
Daniel J. McDuff
Tim Korjakow
Scott Cambo
Jesse Josua Benjamin
Jenny Lee
...
Aaron Gokaslan
Alek Tarkowski
Joseph Lindley
A. F. Cooper
Danish Contractor
MedIm
40
7
0
07 Feb 2024
UniTSyn: A Large-Scale Dataset Capable of Enhancing the Prowess of Large Language Models for Program Testing
Yifeng He
Jiabo Huang
Yuyang Rong
Yiwen Guo
Ethan Wang
Hao Chen
26
4
0
04 Feb 2024
The Landscape and Challenges of HPC Research and LLMs
Le Chen
Nesreen K. Ahmed
Akashnil Dutta
Arijit Bhattacharjee
Sixing Yu
...
Vy A. Vo
J. P. Muñoz
Ted Willke
Tim Mattson
Ali Jannesari
AI4CE
48
20
0
03 Feb 2024
OMPGPT: A Generative Pre-trained Transformer Model for OpenMP
Le Chen
Arijit Bhattacharjee
Nesreen Ahmed
N. Hasabnis
Gal Oren
Vy A. Vo
Ali Jannesari
VLM
31
11
0
28 Jan 2024
Temporal Blind Spots in Large Language Models
Jonas Wallat
Adam Jatowt
Avishek Anand
38
3
0
22 Jan 2024
Knowledge Fusion of Large Language Models
Fanqi Wan
Xinting Huang
Deng Cai
Xiaojun Quan
Wei Bi
Shuming Shi
MoMe
40
63
0
19 Jan 2024
JumpCoder: Go Beyond Autoregressive Coder via Online Modification
Mouxiang Chen
Hao Tian
Zhongxi Liu
Xiaoxue Ren
Jianling Sun
SyDa
KELM
43
2
0
15 Jan 2024
DebugBench: Evaluating Debugging Capability of Large Language Models
Runchu Tian
Yining Ye
Yujia Qin
Xin Cong
Yankai Lin
...
Yesai Wu
Haotian Hui
Weichuan Liu
Zhiyuan Liu
Maosong Sun
ELM
40
28
0
09 Jan 2024
KernelGPT: Enhanced Kernel Fuzzing via Large Language Models
Chenyuan Yang
Zijie Zhao
Lingming Zhang
25
13
0
31 Dec 2023
Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention
Kaiqiang Song
Xiaoyang Wang
Sangwoo Cho
Xiaoman Pan
Dong Yu
34
7
0
14 Dec 2023
Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations
Zilu Tang
Mayank Agarwal
Alex Shypula
Bailin Wang
Derry Wijaya
Jie Chen
Yoon Kim
LRM
37
15
0
13 Nov 2023
CompCodeVet: A Compiler-guided Validation and Enhancement Approach for Code Dataset
Le Chen
Arijit Bhattacharjee
Nesreen K. Ahmed
N. Hasabnis
Gal Oren
Bin Lei
Ali Jannesari
LRM
34
3
0
11 Nov 2023
AdaLomo: Low-memory Optimization with Adaptive Learning Rate
Kai Lv
Hang Yan
Qipeng Guo
Haijun Lv
Xipeng Qiu
ODL
27
20
0
16 Oct 2023
CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules
Hung Le
Hailin Chen
Amrita Saha
Akash Gokul
Doyen Sahoo
Chenyu You
LRM
28
42
0
13 Oct 2023
Fine-tune Language Models to Approximate Unbiased In-context Learning
Timothy Chu
Zhao Song
Chiwun Yang
27
15
0
05 Oct 2023
FELM: Benchmarking Factuality Evaluation of Large Language Models
Shiqi Chen
Yiran Zhao
Jinghan Zhang
Ethan Chern
Siyang Gao
Pengfei Liu
Junxian He
HILM
38
33
0
01 Oct 2023
AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ
Jonas Belouadi
Anne Lauscher
Steffen Eger
21
28
0
30 Sep 2023
Cognitive Architectures for Language Agents
T. Sumers
Shunyu Yao
Karthik Narasimhan
Thomas L. Griffiths
LLMAG
LM&Ro
56
154
0
05 Sep 2023
Bias Testing and Mitigation in LLM-based Code Generation
Dong Huang
Qingwen Bu
Jie M. Zhang
Xiaofei Xie
Junjie Chen
Heming Cui
48
20
0
03 Sep 2023
On the Impact of Language Selection for Training and Evaluating Programming Language Models
J. Katzy
M. Izadi
A. van Deursen
53
5
0
25 Aug 2023
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
M. Wong
Shangxin Guo
Ching Nam Hang
Siu-Wai Ho
C. Tan
42
78
0
04 Jul 2023
Structured Thoughts Automaton: First Formalized Execution Model for Auto-Regressive Language Models
T. Vanderbruggen
C. Liao
P. Pirkelbauer
Pei-Hung Lin
LRM
ALM
24
2
0
16 Jun 2023
Is Self-Repair a Silver Bullet for Code Generation?
Theo X. Olausson
J. Inala
Chenglong Wang
Jianfeng Gao
Armando Solar-Lezama
LRM
28
108
0
16 Jun 2023
The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code
Xiao Liu
Da Yin
Chen Zhang
Yansong Feng
Dongyan Zhao
ELM
ReLM
ReCod
LRM
40
20
0
30 May 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
Yangqiu Song
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
32
69
0
25 May 2023