2102.07492
DOBF: A Deobfuscation Pre-Training Objective for Programming Languages
15 February 2021
Baptiste Roziere
Marie-Anne Lachaux
Marc Szafraniec
Guillaume Lample
AI4CE
Papers citing
"DOBF: A Deobfuscation Pre-Training Objective for Programming Languages"
50 / 62 papers shown
Title
ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding
Indraneil Paul
Haoyi Yang
Goran Glavas
Kristian Kersting
Iryna Gurevych
AAML
SyDa
41
0
0
27 Mar 2025
LLM-Aided Customizable Profiling of Code Data Based On Programming Language Concepts
Pankaj Thorat
Adnan Qidwai
Adrija Dhar
Aishwariya Chakraborty
Anand Eswaran
Hima Patel
Praveen Jayachandran
59
0
0
19 Mar 2025
Building A Coding Assistant via the Retrieval-Augmented Language Model
Xinze Li
Hanbin Wang
Zhenghao Liu
S. Yu
Shuo Wang
Yukun Yan
Yukai Fu
Yu Gu
Ge Yu
3DV
RALM
23
2
0
21 Oct 2024
TRANSAGENT: An LLM-Based Multi-Agent System for Code Translation
Zhiqiang Yuan
Weitong Chen
Hanlin Wang
Kai Yu
Xin Peng
Yiling Lou
LLMAG
28
8
0
30 Sep 2024
A Joint Learning Model with Variational Interaction for Multilingual Program Translation
Yali Du
Hui Sun
Ming Li
35
2
0
25 Aug 2024
COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis
Weiqing Yang
Hanbin Wang
Zhenghao Liu
Xinze Li
Yukun Yan
Shuo Wang
Yu Gu
Minghe Yu
Zhiyuan Liu
Ge Yu
50
2
0
09 Aug 2024
Scaling Automatic Extraction of Pseudocode
Levent Toksoz
Gang Tan
C. L. Giles
32
0
0
07 Jun 2024
Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns
Constantinos Patsakis
Fran Casino
Nikolaos Lykousas
44
12
0
30 Apr 2024
Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension
Mengnan Qi
Yufan Huang
Yongqiang Yao
Maoquan Wang
Bin Gu
Neel Sundaresan
43
2
0
13 Apr 2024
Exploring the Impact of the Output Format on the Evaluation of Large Language Models for Code Translation
Marcos Macedo
Yuan Tian
F. Côgo
Bram Adams
38
12
0
25 Mar 2024
LLM4Decompile: Decompiling Binary Code with Large Language Models
Hanzhuo Tan
Qi Luo
Jing Li
Yuqun Zhang
SyDa
ELM
50
17
0
08 Mar 2024
Code Representation Learning At Scale
Dejiao Zhang
W. Ahmad
Ming Tan
Hantian Ding
Ramesh Nallapati
Dan Roth
Xiaofei Ma
Bing Xiang
OffRL
21
8
0
02 Feb 2024
Structured Code Representations Enable Data-Efficient Adaptation of Code Language Models
Mayank Agarwal
Yikang Shen
Bailin Wang
Yoon Kim
Jie Chen
37
5
0
19 Jan 2024
A Study on Training and Developing Large Language Models for Behavior Tree Generation
Fu Li
Xueying Wang
Bin Li
Yunlong Wu
Yanzhen Wang
Xiaodong Yi
14
4
0
16 Jan 2024
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding
Linyuan Gong
Mostafa Elhoushi
Alvin Cheung
29
11
0
05 Jan 2024
Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit
Yao Wan
Yang He
Zhangqian Bi
Jianguo Zhang
Hongyu Zhang
Yulei Sui
Guandong Xu
Hai Jin
Philip S. Yu
30
20
0
30 Dec 2023
AdaCCD: Adaptive Semantic Contrasts Discovery Based Cross Lingual Adaptation for Code Clone Detection
Yangkai Du
Tengfei Ma
Lingfei Wu
Xuhong Zhang
Shouling Ji
27
3
0
13 Nov 2023
Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations
Zilu Tang
Mayank Agarwal
Alex Shypula
Bailin Wang
Derry Wijaya
Jie Chen
Yoon Kim
LRM
37
15
0
13 Nov 2023
Data Augmentation for Code Translation with Comparable Corpora and Multiple References
Yiqing Xie
Atharva Naik
Daniel Fried
Carolyn Rose
46
6
0
01 Nov 2023
LILO: Learning Interpretable Libraries by Compressing and Documenting Code
Gabriel Grand
L. Wong
Matthew Bowers
Theo X. Olausson
Muxin Liu
Joshua B. Tenenbaum
Jacob Andreas
16
21
0
30 Oct 2023
SUT: Active Defects Probing for Transcompiler Models
Mengnan Qi
Yufan Huang
Maoquan Wang
Yongqiang Yao
Zihan Liu
Bin Gu
Colin B. Clement
Neel Sundaresan
25
2
0
22 Oct 2023
Program Translation via Code Distillation
Yufan Huang
Mengnan Qi
Yongqiang Yao
Maoquan Wang
Bin Gu
Colin B. Clement
Neel Sundaresan
21
4
0
17 Oct 2023
Automatically Testing Functional Properties of Code Translation Models
Hasan Ferit Eniser
Valentin Wüstholz
M. Christakis
18
6
0
07 Sep 2023
Code Llama: Open Foundation Models for Code
Baptiste Rozière
Jonas Gehring
Fabian Gloeckle
Sten Sootla
Itai Gat
...
Hugo Touvron
Louis Martin
Nicolas Usunier
Thomas Scialom
Gabriel Synnaeve
ELM
ALM
63
1,898
0
24 Aug 2023
Large Language Models for Software Engineering: A Systematic Literature Review
Xinyi Hou
Yanjie Zhao
Yue Liu
Zhou Yang
Kailong Wang
Li Li
Xiapu Luo
David Lo
John C. Grundy
Haoyu Wang
39
322
0
21 Aug 2023
Exploiting Code Symmetries for Learning Program Semantics
Kexin Pei
Weichen Li
Qirui Jin
Shuyang Liu
Scott Geng
Lorenzo Cavallaro
Junfeng Yang
Suman Jana
39
4
0
07 Aug 2023
CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code
Nadezhda Chirkova
Sergey Troshin
21
8
0
01 Aug 2023
An Exploratory Literature Study on Sharing and Energy Use of Language Models for Source Code
Max Hort
Anastasiia Grishina
Leon Moonen
18
2
0
05 Jul 2023
A systematic literature review on source code similarity measurement and clone detection: techniques, applications, and challenges
Morteza Zakeri-Nasrabadi
Saeed Parsa
Mohammad Ramezani
C. Roy
Masoud Ekhtiarzadeh
35
43
0
28 Jun 2023
CoTran: An LLM-based Code Translator using Reinforcement Learning with Feedback from Compiler and Symbolic Execution
Prithwish Jana
Piyush Jha
Haoyang Ju
Gautham Kishore
Aryan Mahajan
Vijay Ganesh
24
12
0
11 Jun 2023
A Comprehensive Review of State-of-The-Art Methods for Java Code Generation from Natural Language Text
Jessica Nayeli López Espejel
Mahaman Sanoussi Yahaya Alassan
El Mehdi Chouham
Walid Dahhane
E. Ettifouri
21
13
0
10 Jun 2023
Coarse-Tuning Models of Code with Reinforcement Learning Feedback
Abhinav C. P. Jain
Chima Adiole
Swarat Chaudhuri
Thomas W. Reps
Chris Jermaine
ALM
19
2
0
25 May 2023
Neural Machine Translation for Code Generation
K. Dharma
Clayton T. Morrison
32
4
0
22 May 2023
Make Prompt-based Black-Box Tuning Colorful: Boosting Model Generalization from Three Orthogonal Perspectives
Qiushi Sun
Chengcheng Han
Nuo Chen
Renyu Zhu
Jing Gong
Xiang Li
Ming Gao
VLM
27
8
0
14 May 2023
ADELT: Transpilation Between Deep Learning Frameworks
Linyuan Gong
Jiayi Wang
Alvin Cheung
32
3
0
07 Mar 2023
Silent Vulnerable Dependency Alert Prediction with Vulnerability Key Aspect Explanation
Jiamou Sun
Zhenchang Xing
Qinghua Lu
Xiwei Xu
Liming Zhu
Thong Hoang
Dehai Zhao
17
12
0
15 Feb 2023
Syntax and Domain Aware Model for Unsupervised Program Translation
Fang Liu
Jia Li
Li Zhang
25
18
0
08 Feb 2023
Measuring The Impact Of Programming Language Distribution
Gabriel Orlanski
Kefan Xiao
Xavier Garcia
Jeffrey Hui
Joshua Howland
J. Malmaud
Jacob Austin
Rishabh Singh
Michele Catasta
30
28
0
03 Feb 2023
SantaCoder: don't reach for the stars!
Loubna Ben Allal
Raymond Li
Denis Kocetkov
Chenghao Mou
Christopher Akiki
...
Sean M. Hughes
Daniel Fried
Arjun Guha
H. D. Vries
Leandro von Werra
39
189
0
09 Jan 2023
A Survey on Pretrained Language Models for Neural Code Intelligence
Yichen Xu
Yanqiao Zhu
4
17
0
20 Dec 2022
Parameter-Efficient Finetuning of Transformers for Source Code
Shamil Ayupov
Nadezhda Chirkova
22
17
0
12 Dec 2022
The Stack: 3 TB of permissively licensed source code
Denis Kocetkov
Raymond Li
Loubna Ben Allal
Jia Li
Chenghao Mou
...
Sean M. Hughes
Thomas Wolf
Dzmitry Bahdanau
Leandro von Werra
H. D. Vries
58
307
0
20 Nov 2022
Evaluating How Fine-tuning on Bimodal Data Effects Code Generation
Gabriel Orlanski
Seonhye Yang
Michael Healy
ALM
21
5
0
15 Nov 2022
Efficient Training of Language Models to Fill in the Middle
Mohammad Bavarian
Heewoo Jun
Nikolas Tezak
John Schulman
C. McLeavey
Jerry Tworek
Mark Chen
4
178
0
28 Jul 2022
NatGen: Generative pre-training by "Naturalizing" source code
Saikat Chakraborty
Toufique Ahmed
Yangruibo Ding
Prem Devanbu
Baishakhi Ray
AI4CE
55
116
0
15 Jun 2022
StructCoder: Structure-Aware Transformer for Code Generation
Sindhu Tipirneni
Ming Zhu
Chandan K. Reddy
30
55
0
10 Jun 2022
CodeAttack: Code-Based Adversarial Attacks for Pre-trained Programming Language Models
Akshita Jha
Chandan K. Reddy
SILM
ELM
AAML
27
58
0
31 May 2022
VulBERTa: Simplified Source Code Pre-Training for Vulnerability Detection
Hazim Hanif
S. Maffeis
58
95
0
25 May 2022
Deep Learning Meets Software Engineering: A Survey on Pre-Trained Models of Source Code
Changan Niu
Chuanyi Li
Bin Luo
Vincent Ng
SyDa
VLM
47
48
0
24 May 2022
Summarize and Generate to Back-translate: Unsupervised Translation of Programming Languages
Wasi Uddin Ahmad
Saikat Chakraborty
Baishakhi Ray
Kai-Wei Chang
44
27
0
23 May 2022