Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2207.01780
Cited By
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
5 July 2022
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
S. Hoi
SyDa
ALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning"
50 / 51 papers shown
Title
Self Rewarding Self Improving
Toby Simonds
Kevin Lopez
Akira Yoshiyama
Dominique Garmier
ReLM
ALM
LRM
45
0
0
12 May 2025
Synthetic Code Surgery: Repairing Bugs and Vulnerabilities with LLMs and Synthetic Data
David de-Fitero-Dominguez
Antonio Garcia-Cabot
Eva García-López
SyDa
71
0
0
12 May 2025
AgentXploit: End-to-End Redteaming of Black-Box AI Agents
Zhun Wang
Vincent Siu
Zhe Ye
Tianneng Shi
Yuzhou Nie
Xuandong Zhao
Chenguang Wang
Wenbo Guo
Dawn Song
LLMAG
AAML
36
0
0
09 May 2025
AKD : Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks
Ilyas Oulkadda
Julien Perez
ALM
47
0
0
05 May 2025
Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL
Simone Papicchio
Simone Rossi
Luca Cagliero
Paolo Papotti
ReLM
LMTD
AI4TS
LRM
58
0
0
21 Apr 2025
Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models
M. Wong
C. Tan
ALM
83
4
0
19 Mar 2025
Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning
Yuan Jiang
Yujian Zhang
Liang Lu
Christoph Treude
Xiaohong Su
Shan Huang
Tiantian Wang
ALM
63
0
0
12 Mar 2025
Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection
Boyu Mi
Hanqing Wang
Tai Wang
Yilun Chen
Jiangmiao Pang
74
0
0
21 Feb 2025
LeDex: Training LLMs to Better Self-Debug and Explain Code
Nan Jiang
Xiaopeng Li
Shiqi Wang
Qiang Zhou
Soneya Binta Hossain
Baishakhi Ray
Varun Kumar
Xiaofei Ma
Anoop Deoras
LRM
92
11
0
17 Feb 2025
Preference Optimization for Reasoning with Pseudo Feedback
Fangkai Jiao
Geyang Guo
Xingxing Zhang
Nancy F. Chen
Shafiq R. Joty
Furu Wei
LRM
99
9
0
17 Feb 2025
MIH-TCCT: Mitigating Inconsistent Hallucinations in LLMs via Event-Driven Text-Code Cyclic Training
Xinxin You
Xien Liu
Qixin Sun
Huan Zhang
Kaiyin Zhou
Shaohui Liu
Guoping Hu
Shijin Wang
Si Liu
Ji Wu
85
0
0
13 Feb 2025
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion
Yannis Flet-Berliac
Nathan Grinsztajn
Florian Strub
Bill Wu
Eugene Choi
...
Arash Ahmadian
Yash Chandak
M. G. Azar
Olivier Pietquin
Matthieu Geist
OffRL
64
5
0
17 Jan 2025
Planning-Driven Programming: A Large Language Model Programming Workflow
Chao Lei
Yanchuan Chang
N. Lipovetzky
Krista A. Ehinger
86
2
0
10 Jan 2025
FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding Optimization system
Zeyuan Li
Yangfan He
Lewei He
Jianhui Wang
Tianyu Shi
Bin Lei
Yuchen Li
Qiuwu Chen
ALM
67
5
0
28 Oct 2024
Process Supervision-Guided Policy Optimization for Code Generation
Ning Dai
Zheng Wu
Renjie Zheng
Ziyun Wei
Wenlei Shi
Xing Jin
Guanlin Liu
Chen Dun
Liang Huang
Lin Yan
54
8
0
23 Oct 2024
CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair
Mingjie Liu
Yun-Da Tsai
Wenfei Zhou
Haoxing Ren
SyDa
3DV
45
6
0
19 Sep 2024
Problem Solving Through Human-AI Preference-Based Cooperation
Subhabrata Dutta
Timo Kaufmann
Goran Glavas
Ivan Habernal
Kristian Kersting
Frauke Kreuter
Mira Mezini
Iryna Gurevych
Eyke Hüllermeier
Hinrich Schuetze
98
1
0
14 Aug 2024
Empathy Level Alignment via Reinforcement Learning for Empathetic Response Generation
Hui Ma
Bo Zhang
Bo Xu
Jian Wang
Hongfei Lin
Xiao Sun
57
1
0
06 Aug 2024
Is Programming by Example solved by LLMs?
Wen-Ding Li
Kevin Ellis
37
10
0
12 Jun 2024
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff
Hao Tang
Keya Hu
Jin Peng Zhou
Sicheng Zhong
Wei-Long Zheng
Xujie Si
Kevin Ellis
42
13
0
26 May 2024
Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search
Max Liu
Chan-Hung Yu
Wei-Hsu Lee
Cheng-Wei Hung
Yen-Chun Chen
Shao-Hua Sun
55
4
0
26 May 2024
Reinforcement Learning-Guided Semi-Supervised Learning
Marzi Heidari
Hanping Zhang
Yuhong Guo
OffRL
39
0
0
02 May 2024
Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement
Zaid Khan
B. Vijaykumar
S. Schulter
Yun Fu
Manmohan Chandraker
LRM
ReLM
34
6
0
06 Apr 2024
Exploring and Evaluating Hallucinations in LLM-Powered Code Generation
Fang Liu
Yang Liu
Lin Shi
Houkun Huang
Ruifeng Wang
Zhen Yang
Li Zhang
Zhongqi Li
Yuchi Ma
52
108
0
01 Apr 2024
VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding
Ahmad A Mahmood
Ashmal Vayani
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
LRM
56
7
0
21 Mar 2024
CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement Learning for LLM-based Mutation
Jueon Eom
Seyeon Jeong
Taekyoung Kwon
32
7
0
19 Feb 2024
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
Wenkai Yang
Xiaohan Bi
Yankai Lin
Sishuo Chen
Jie Zhou
Xu Sun
LLMAG
AAML
44
53
0
17 Feb 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Zhiheng Xi
Wenxiang Chen
Boyang Hong
Senjie Jin
Rui Zheng
...
Xinbo Zhang
Peng Sun
Tao Gui
Qi Zhang
Xuanjing Huang
LRM
37
20
0
08 Feb 2024
LangProp: A code optimization framework using Large Language Models applied to driving
Shu Ishida
Gianluca Corrado
George Fedoseev
Hudson Yeo
Lloyd Russell
Jamie Shotton
João F. Henriques
Anthony Hu
59
11
0
18 Jan 2024
CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules
Hung Le
Hailin Chen
Amrita Saha
Akash Gokul
Doyen Sahoo
Shafiq R. Joty
LRM
28
42
0
13 Oct 2023
Cognitive Architectures for Language Agents
T. Sumers
Shunyu Yao
Karthik Narasimhan
Thomas L. Griffiths
LLMAG
LM&Ro
54
153
0
05 Sep 2023
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan.Z Sheng
Julian McAuley
Lina Yao
SyDa
46
10
0
28 Aug 2023
A Lightweight Framework for High-Quality Code Generation
Mohammed Latif Siddiq
B.K. Casey
Joanna C. S. Santos
44
17
0
17 Jul 2023
Exploring Continual Learning for Code Generation Models
Prateek Yadav
Q. Sun
Hantian Ding
Xiaopeng Li
Dejiao Zhang
...
Parminder Bhatia
Ramesh Nallapati
M. K. Ramanathan
Joey Tianyi Zhou
Bing Xiang
CLL
37
30
0
05 Jul 2023
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
M. Wong
Shangxin Guo
Ching Nam Hang
Siu-Wai Ho
C. Tan
42
78
0
04 Jul 2023
Is Self-Repair a Silver Bullet for Code Generation?
Theo X. Olausson
J. Inala
Chenglong Wang
Jianfeng Gao
Armando Solar-Lezama
LRM
26
108
0
16 Jun 2023
Coarse-Tuning Models of Code with Reinforcement Learning Feedback
Abhinav C. P. Jain
Chima Adiole
Swarat Chaudhuri
Thomas W. Reps
Chris Jermaine Rice University
ALM
22
2
0
25 May 2023
Neural Machine Translation for Code Generation
K. Dharma
Clayton T. Morrison
32
4
0
22 May 2023
Textually Pretrained Speech Language Models
Michael Hassid
Tal Remez
Tu Nguyen
Itai Gat
Alexis Conneau
...
Alexandre Défossez
Gabriel Synnaeve
Emmanuel Dupoux
Roy Schwartz
Yossi Adi
VLM
SyDa
31
53
0
22 May 2023
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation
Xinyu Li
Jiang-Tian Xue
Zheng Xie
Ming Li
LRM
19
26
0
18 May 2023
LeTI: Learning to Generate from Textual Interactions
Xingyao Wang
Hao Peng
Reyhaneh Jabbarvand
Heng Ji
35
30
0
17 May 2023
RunBugRun -- An Executable Dataset for Automated Program Repair
Julian Aron Prenner
Romain Robbes
38
11
0
03 Apr 2023
Greener yet Powerful: Taming Large Code Generation Models with Quantization
Xiaokai Wei
Sujan Kumar Gonugondla
W. Ahmad
Shiqi Wang
Baishakhi Ray
...
Ben Athiwaratkun
Mingyue Shang
M. K. Ramanathan
Parminder Bhatia
Bing Xiang
MQ
28
6
0
09 Mar 2023
Execution-based Code Generation using Deep Reinforcement Learning
Parshin Shojaee
Aneesh Jain
Sindhu Tipirneni
Chandan K. Reddy
25
52
0
31 Jan 2023
A Survey on Natural Language Processing for Programming
Qingfu Zhu
Xianzhen Luo
Fang Liu
Cuiyun Gao
Wanxiang Che
25
1
0
12 Dec 2022
BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets
Chen Gong
Zhou Yang
Yunru Bai
Junda He
Jieke Shi
...
Arunesh Sinha
Bowen Xu
Xinwen Hou
David Lo
Guoliang Fan
AAML
OffRL
21
7
0
07 Oct 2022
CodeT: Code Generation with Generated Tests
Bei Chen
Fengji Zhang
A. Nguyen
Daoguang Zan
Zeqi Lin
Jian-Guang Lou
Weizhu Chen
43
319
0
21 Jul 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
319
11,953
0
04 Mar 2022
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Shafiq R. Joty
S. Hoi
238
1,489
0
02 Sep 2021
Measuring Coding Challenge Competence With APPS
Dan Hendrycks
Steven Basart
Saurav Kadavath
Mantas Mazeika
Akul Arora
...
Collin Burns
Samir Puranik
Horace He
D. Song
Jacob Steinhardt
ELM
AIMat
ALM
208
624
0
20 May 2021
1
2
Next