2410.17126
Exploring RL-based LLM Training for Formal Language Tasks with Programmed Rewards
22 October 2024
Alexander Padula
Dennis J. N. J. Soemers
OffRL
Papers citing
"Exploring RL-based LLM Training for Formal Language Tasks with Programmed Rewards"
8 / 8 papers shown
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
Ziyang Luo
Can Xu
Pu Zhao
Qingfeng Sun
Xiubo Geng
Wenxiang Hu
Chongyang Tao
Jing Ma
Qingwei Lin
Daxin Jiang
ELM
SyDa
ALM
79
678
0
14 Jun 2023
Understanding plasticity in neural networks
Clare Lyle
Zeyu Zheng
Evgenii Nikishin
Bernardo Avila-Pires
Razvan Pascanu
Will Dabney
AI4CE
97
101
0
02 Mar 2023
Execution-based Code Generation using Deep Reinforcement Learning
Parshin Shojaee
Aneesh Jain
Sindhu Tipirneni
Chandan K. Reddy
74
57
0
31 Jan 2023
Program Synthesis with Large Language Models
Jacob Austin
Augustus Odena
Maxwell Nye
Maarten Bosma
Henryk Michalewski
...
Ellen Jiang
Carrie J. Cai
Michael Terry
Quoc V. Le
Charles Sutton
ELM
AIMat
ReCod
ALM
195
1,948
0
16 Aug 2021
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
230
7,498
0
02 Oct 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
460
1,727
0
18 Sep 2019
Ludii -- The Ludemic General Game System
Éric Piette
Dennis J. N. J. Soemers
Matthew Stephenson
C. F. Sironi
M. Winands
C. Browne
LLMAG
74
71
0
13 May 2019
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
478
19,019
0
20 Jul 2017