2505.24760
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards
30 May 2025
Zafir Stojanovski
Oliver Stanley
Joe Sharratt
Richard Jones
Abdulhakeem Adefioye
Jean Kaddour
Andreas Köpf
OffRL
LRM
Papers citing
"REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards"
25 / 25 papers shown
Title
Learning to Reason without External Rewards
Xuandong Zhao
Zhewei Kang
Aosong Feng
Sergey Levine
Dawn Song
OffRL
ReLM
LRM
32
2
0
26 May 2025
Sudoku-Bench: Evaluating creative reasoning with Sudoku variants
Jeffrey Seely
Yuki Imajuku
Tianyu Zhao
Edoardo Cetin
Llion Jones
LRM
41
1
0
22 May 2025
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
Liang Chen
Hongcheng Gao
Tianyu Liu
Zhiqi Huang
Flood Sung
Xinyu Zhou
Yuxin Wu
Baobao Chang
OffRL
LRM
VLM
10
1
0
19 May 2025
Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment
Siliang Zeng
Quan Wei
William Brown
Oana Frunza
Yuriy Nevmyvaka
Mingyi Hong
LRM
51
2
0
17 May 2025
ARC-AGI-2: A New Challenge for Frontier AI Reasoning Systems
François Chollet
Mike Knoop
Gregory Kamradt
Bryan Landers
Henry Pinkard
LRM
AI4CE
9
1
0
17 May 2025
RAIDEN-R1: Improving Role-awareness of LLMs via GRPO with Verifiable Reward
Zongsheng Wang
Kaili Sun
Bowen Wu
Qun Yu
Ying Li
Baoxun Wang
21
1
0
15 May 2025
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang
Qing Yang
Zhiyuan Zeng
Liliang Ren
Liu Liu
...
Jianfeng Gao
Weizhu Chen
Shuaiqiang Wang
Simon Shaolei Du
Yelong Shen
OffRL
ReLM
LRM
190
23
0
29 Apr 2025
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
Zihan Wang
Kaidi Wang
Q. Wang
Pingyue Zhang
Linjie Li
...
Jiajun Wu
L. Fei-Fei
Lijuan Wang
Yejin Choi
Manling Li
112
20
0
24 Apr 2025
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset
Ivan Moshkov
Darragh Hanley
Ivan Sorokin
Shubham Toshniwal
Christof Henkel
Benedikt Schifferer
Wei Du
Igor Gitman
ReLM
LRM
61
11
0
23 Apr 2025
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Yang Yue
Zhiqi Chen
Rui Lu
Andrew Zhao
Zhaokai Wang
Yang Yue
Shiji Song
Gao Huang
ReLM
LRM
98
55
0
18 Apr 2025
AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models
Qin Zhu
Fei Huang
Runyu Peng
Keming Lu
Bowen Yu
Qinyuan Cheng
Xipeng Qiu
Xuanjing Huang
Junyang Lin
ReLM
ELM
LRM
65
3
0
24 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
168
1,503
0
22 Jan 2025
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
Davide Paglieri
Bartłomiej Cupiał
Samuel Coward
Ulyana Piterbarg
Maciej Wolczyk
...
Lerrel Pinto
Rob Fergus
Jakob Foerster
Jack Parker-Holder
Tim Rocktäschel
LLMAG
LRM
155
16
0
20 Nov 2024
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Terry Yue Zhuo
Minh Chien Vu
Jenny Chim
Han Hu
Wenhao Yu
...
David Lo
Daniel Fried
Xiaoning Du
H. D. Vries
Leandro von Werra
84
158
0
22 Jun 2024
Are We Done with MMLU?
Aryo Pradipta Gema
Joshua Ong Jun Leang
Giwon Hong
Alessio Devoto
Alberto Carlo Maria Mancino
...
R. McHardy
Joshua Harris
Jean Kaddour
Emile van Krieken
Pasquale Minervini
ELM
92
39
0
06 Jun 2024
Momentum-based Weight Interpolation of Strong Zero-Shot Models for Continual Learning
Zafir Stojanovski
Karsten Roth
Zeynep Akata
33
17
0
06 Nov 2022
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
146
5,328
0
07 Jul 2021
Dynabench: Rethinking Benchmarking in NLP
Douwe Kiela
Max Bartolo
Yixin Nie
Divyansh Kaushik
Atticus Geiger
...
Pontus Stenetorp
Robin Jia
Joey Tianyi Zhou
Christopher Potts
Adina Williams
130
401
0
07 Apr 2021
Probabilistic Active Meta-Learning
Jean Kaddour
Steindór Sæmundsson
M. Deisenroth
49
35
0
17 Jul 2020
Meta-Learning in Neural Networks: A Survey
Timothy M. Hospedales
Antreas Antoniou
P. Micaelli
Amos Storkey
OOD
296
1,950
0
11 Apr 2020
Meta-Learning: A Survey
Joaquin Vanschoren
FedML
OOD
53
756
0
08 Oct 2018
Progress & Compress: A scalable framework for continual learning
Jonathan Richard Schwarz
Jelena Luketina
Wojciech M. Czarnecki
A. Grabska-Barwinska
Yee Whye Teh
Razvan Pascanu
R. Hadsell
CLL
95
877
0
16 May 2018
Dynamic Few-Shot Visual Learning without Forgetting
Spyros Gidaris
N. Komodakis
VLM
46
1,125
0
25 Apr 2018
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
757
11,793
0
09 Mar 2017
Overcoming catastrophic forgetting in neural networks
J. Kirkpatrick
Razvan Pascanu
Neil C. Rabinowitz
J. Veness
Guillaume Desjardins
...
A. Grabska-Barwinska
Demis Hassabis
Claudia Clopath
D. Kumaran
R. Hadsell
CLL
260
7,410
0
02 Dec 2016