ResearchTrend.AI
CodeBERT: A Pre-Trained Model for Programming and Natural Languages

19 February 2020
Zhangyin Feng
Daya Guo
Duyu Tang
Nan Duan
Xiaocheng Feng
Ming Gong
Linjun Shou
Bing Qin
Ting Liu
Daxin Jiang
Ming Zhou

Papers citing "CodeBERT: A Pre-Trained Model for Programming and Natural Languages"

50 / 314 papers shown
Title
CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models
Hossein Hajipour
Keno Hassler
Thorsten Holz
Lea Schönherr
Mario Fritz
ELM
40
20
0
08 Feb 2023
ChatGPT and Software Testing Education: Promises & Perils
Sajed Jalil
Suzzana Rafi
Thomas D. LaToza
Kevin Moran
Wing Lam
ELM
27
172
0
07 Feb 2023
Exploring Data Augmentation for Code Generation Tasks
Pinzhen Chen
Gerasimos Lampouras
31
9
0
05 Feb 2023
VuLASTE: Long Sequence Model with Abstract Syntax Tree Embedding for vulnerability Detection
Botong Zhu
Huobin Tan
31
0
0
05 Feb 2023
Measuring The Impact Of Programming Language Distribution
Gabriel Orlanski
Kefan Xiao
Xavier Garcia
Jeffrey Hui
Joshua Howland
J. Malmaud
Jacob Austin
Rishabh Singh
Michele Catasta
30
28
0
03 Feb 2023
KNOD: Domain Knowledge Distilled Tree Decoder for Automated Program Repair
Nan Jiang
Thibaud Lutellier
Yiling Lou
Lin Tan
Dan Goldwasser
Xinming Zhang
27
43
0
03 Feb 2023
Transformers Meet Directed Graphs
Simon Geisler
Yujia Li
D. Mankowitz
A. Cemgil
Stephan Günnemann
Cosmin Paduraru
27
35
0
31 Jan 2023
Execution-based Code Generation using Deep Reinforcement Learning
Parshin Shojaee
Aneesh Jain
Sindhu Tipirneni
Chandan K. Reddy
25
52
0
31 Jan 2023
Which Features are Learned by CodeBert: An Empirical Study of the BERT-based Source Code Representation Learning
Lan Zhang
Chen Cao
Zhilong Wang
Peng Liu
SSL
14
3
0
20 Jan 2023
Learning Compiler Pass Orders using Coreset and Normalized Value Prediction
Youwei Liang
Kevin R. Stone
A. Shameli
Chris Cummins
Mostafa Elhoushi
...
Benoit Steiner
Xiaomeng Yang
P. Xie
Hugh Leather
Yuandong Tian
17
10
0
09 Jan 2023
TrojanPuzzle: Covertly Poisoning Code-Suggestion Models
H. Aghakhani
Wei Dai
Andre Manoel
Xavier Fernandes
Anant Kharkar
Christopher Kruegel
Giovanni Vigna
David E. Evans
B. Zorn
Robert Sim
SILM
29
33
0
06 Jan 2023
Serenity: Library Based Python Code Analysis for Code Completion and Automated Machine Learning
Wenting Zhao
Ibrahim Abdelaziz
Julian T Dolby
Kavitha Srinivas
M. Helali
Essam Mansour
32
0
0
05 Jan 2023
Boosting Neural Networks to Decompile Optimized Binaries
Ying Cao
Ruigang Liang
Kai Chen
Peiwei Hu
31
17
0
03 Jan 2023
Generation-Augmented Query Expansion For Code Retrieval
Dong Li
Yelong Shen
Ruoming Jin
Yi Mao
Kuan-Chieh Jackson Wang
Weizhu Chen
RALM
28
8
0
20 Dec 2022
Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments
Yu Gu
Xiang Deng
Yu-Chuan Su
LLMAG
42
52
0
19 Dec 2022
MultiCoder: Multi-Programming-Lingual Pre-Training for Low-Resource Code Completion
Zi Gong
Yinpeng Guo
Pingyi Zhou
Cuiyun Gao
Yasheng Wang
Zenglin Xu
14
8
0
19 Dec 2022
JEMMA: An Extensible Java Dataset for ML4Code Applications
Anjan Karmakar
Miltiadis Allamanis
Romain Robbes
VLM
29
3
0
18 Dec 2022
Plansformer: Generating Symbolic Plans using Transformers
Vishal Pallagani
Bharath Muppasani
K. Murugesan
F. Rossi
L. Horesh
Biplav Srivastava
F. Fabiano
Andrea Loreggia
LM&Ro
LLMAG
OffRL
21
35
0
16 Dec 2022
An Empirical Study of Deep Learning Models for Vulnerability Detection
Benjamin Steenhoek
Md. Mahbubur Rahman
Richard Jiles
Wei Le
ELM
AAML
38
79
0
15 Dec 2022
Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detection
Benjamin Steenhoek
Hongyang Gao
Wei Le
43
27
0
15 Dec 2022
ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages
Yekun Chai
Shuohuan Wang
Chao Pang
Yu Sun
Hao Tian
Hua Wu
32
35
0
13 Dec 2022
Who Evaluates the Evaluators? On Automatic Metrics for Assessing AI-based Offensive Code Generators
Pietro Liguori
Cristina Improta
R. Natella
B. Cukic
Domenico Cotroneo
ELM
36
16
0
12 Dec 2022
DexBERT: Effective, Task-Agnostic and Fine-grained Representation Learning of Android Bytecode
Tiezhu Sun
Kevin Allix
Kisub Kim
Xin Zhou
Dongsun Kim
David Lo
Tegawendé F. Bissyandé
Jacques Klein
24
11
0
12 Dec 2022
Parameter-Efficient Finetuning of Transformers for Source Code
Shamil Ayupov
Nadezhda Chirkova
22
17
0
12 Dec 2022
A Survey on Natural Language Processing for Programming
Qingfu Zhu
Xianzhen Luo
Fang Liu
Cuiyun Gao
Wanxiang Che
25
2
0
12 Dec 2022
Evaluating How Fine-tuning on Bimodal Data Effects Code Generation
Gabriel Orlanski
Seonhye Yang
Michael Healy
ALM
21
5
0
15 Nov 2022
MPCFormer: fast, performant and private Transformer inference with MPC
Dacheng Li
Rulin Shao
Hongyi Wang
Han Guo
Eric P. Xing
Haotong Zhang
15
79
0
02 Nov 2022
A Simple, Yet Effective Approach to Finding Biases in Code Generation
Spyridon Mouselinos
Mateusz Malinowski
Henryk Michalewski
18
7
0
31 Oct 2022
Poison Attack and Defense on Deep Source Code Processing Models
Jia Li
Zhuo Li
Huangzhao Zhang
Ge Li
Zhi Jin
Xing Hu
Xin Xia
AAML
43
16
0
31 Oct 2022
Multi-lingual Evaluation of Code Generation Models
Ben Athiwaratkun
Sanjay Krishna Gouda
Zijian Wang
Xiaopeng Li
Yuchen Tian
...
Baishakhi Ray
Parminder Bhatia
Sudipta Sengupta
Dan Roth
Bing Xiang
ELM
120
161
0
26 Oct 2022
Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic?
Jean-Baptiste Döderlein
M. Acher
D. Khelladi
B. Combemale
34
33
0
26 Oct 2022
Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming
Hussein Mozannar
Gagan Bansal
Adam Fourney
Eric Horvitz
49
109
0
25 Oct 2022
ObSynth: An Interactive Synthesis System for Generating Object Models from Natural Language Specifications
Alex Gu
Tamara Mitrovska
D. Vélez
Jacob Andreas
Armando Solar-Lezama
SyDa
25
1
0
20 Oct 2022
Code Recommendation for Open Source Software Developers
Yiqiao Jin
Yunsheng Bai
Yanqiao Zhu
Yizhou Sun
Wei Wang
33
24
0
15 Oct 2022
Enriching Biomedical Knowledge for Low-resource Language Through Large-Scale Translation
Long Phan
Tai Dang
H. Tran
Trieu H. Trinh
Vy Phan
Lam D. Chau
Minh-Thang Luong
15
8
0
11 Oct 2022
Leveraging Artificial Intelligence on Binary Code Comprehension
Yifan Zhang
37
3
0
11 Oct 2022
Pre-Training Representations of Binary Code Using Contrastive Learning
Yifan Zhang
Chen Huang
Yueke Zhang
Kevin Cao
Scott Thomas Andersen
Huajie Shao
Kevin Leach
Yu Huang
50
3
0
11 Oct 2022
Novice Type Error Diagnosis with Natural Language Models
Chuqin Geng
Haolin Ye
Yixuan Li
Tianyu Han
B. Pientka
X. Si
27
3
0
07 Oct 2022
BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets
Chen Gong
Zhou Yang
Yunru Bai
Junda He
Jieke Shi
...
Arunesh Sinha
Bowen Xu
Xinwen Hou
David Lo
Guoliang Fan
AAML
OffRL
24
7
0
07 Oct 2022
CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure
Nuo Chen
Qiushi Sun
Renyu Zhu
Xiang Li
Xuesong Lu
Ming Gao
44
10
0
07 Oct 2022
MIXCODE: Enhancing Code Classification by Mixup-Based Data Augmentation
Zeming Dong
Qiang Hu
Yuejun Guo
Maxime Cordy
Mike Papadakis
Zhenya Zhang
Yves Le Traon
Jianjun Zhao
31
8
0
06 Oct 2022
T5QL: Taming language models for SQL generation
Samuel Arcadinho
David Oliveira Aparício
Hugo Veiga
António Alegria
26
6
0
21 Sep 2022
Statement-Level Vulnerability Detection: Learning Vulnerability Patterns Through Information Theory and Contrastive Learning
Van Nguyen
Trung Le
C. Tantithamthavorn
Michael Fu
John C. Grundy
Hung Nguyen
S. Çamtepe
Paul Quirk
Dinh Q. Phung
41
4
0
20 Sep 2022
Malicious Source Code Detection Using Transformer
Chen Tsfaty
Michael Fire
34
4
0
16 Sep 2022
Exploring Code Style Transfer with Neural Networks
Karl Munson
Anish Savla
Chih-Kai Ting
Serenity Wade
Kiran Kate
Kavitha Srinivas
CLIP
16
0
0
13 Sep 2022
Don't Complete It! Preventing Unhelpful Code Completion for Productive and Sustainable Neural Code Completion Systems
Zhensu Sun
Xiaoning Du
Fu Song
Shangwen Wang
Mingze Ni
Li Li
29
10
0
13 Sep 2022
VulCurator: A Vulnerability-Fixing Commit Detector
Truong-Giang Nguyen
Thanh Le-Cong
Hong Jin Kang
X. Le
David Lo
17
20
0
07 Sep 2022
AutoPruner: Transformer-Based Call Graph Pruning
Thanh Le-Cong
Hong Jin Kang
Truong-Giang Nguyen
S. A. Haryono
David Lo
X. Le
H. Thang
27
19
0
07 Sep 2022
Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants
Gustavo Sandoval
Hammond Pearce
Teo Nys
Ramesh Karri
S. Garg
Brendan Dolan-Gavitt
ELM
27
90
0
20 Aug 2022
MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation
Federico Cassano
John Gouwar
Daniel Nguyen
S. Nguyen
Luna Phipps-Costin
...
Carolyn Jane Anderson
Molly Q. Feldman
Arjun Guha
Michael Greenberg
Abhinav Jangda
ELM
30
83
0
17 Aug 2022