1606.01540
Cited By
OpenAI Gym
5 June 2016
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL
ODL
Papers citing
"OpenAI Gym"
50 / 2,578 papers shown
Title
Robust Dynamic Material Handling via Adaptive Constrained Evolutionary Reinforcement Learning
Chengpeng Hu
Ziming Wang
Bo Yuan
Jialin Liu
Chengqi Zhang
Xin Yao
20
0
0
20 Jun 2025
Off-Policy Actor-Critic for Adversarial Observation Robustness: Virtual Alternative Training via Symmetric Policy Evaluation
Kosuke Nakanishi
Akihiro Kubo
Yuji Yasui
Shin Ishii
AAML
OffRL
17
0
0
20 Jun 2025
Overcoming Overfitting in Reinforcement Learning via Gaussian Process Diffusion Policy
Amornyos Horprasert
Esa Apriaskar
Xingyu Liu
Lanlan Su
Lyudmila S. Mihaylova
22
0
0
16 Jun 2025
TooBadRL: Trigger Optimization to Boost Effectiveness of Backdoor Attacks on Deep Reinforcement Learning
Songze Li
Mingxuan Zhang
Kang Wei
Shouling Ji
AAML
90
0
0
11 Jun 2025
Wasserstein Barycenter Soft Actor-Critic
Zahra Shahrooei
Ali Baheri
OffRL
57
0
0
11 Jun 2025
When Maximum Entropy Misleads Policy Optimization
Ruipeng Zhang
Ya-Chien Chang
Sicun Gao
34
0
0
05 Jun 2025
Unsupervised Meta-Testing with Conditional Neural Processes for Hybrid Meta-Reinforcement Learning
S. E. Ada
Emre Ugur
BDL
56
1
0
04 Jun 2025
Bridging the Performance Gap Between Target-Free and Target-Based Reinforcement Learning With Iterated Q-Learning
Théo Vincent
Yogesh Tripathi
Tim Lukas Faust
Yaniv Oren
Jan Peters
Carlo D'Eramo
CLL
34
0
0
04 Jun 2025
The Actor-Critic Update Order Matters for PPO in Federated Reinforcement Learning
Zhijie Xie
Shenghui Song
53
0
0
02 Jun 2025
RLAE: Reinforcement Learning-Assisted Ensemble for LLMs
Y. Fu
Yuanheng Zhu
Jiajun Chai
Guojun Yin
Wei Lin
Qichao Zhang
Dongbin Zhao
23
0
0
31 May 2025
Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn
Hongyao Tang
J. Obando-Ceron
Pablo Samuel Castro
Aaron Courville
Glen Berseth
38
0
0
31 May 2025
Optimizing Sensory Neurons: Nonlinear Attention Mechanisms for Accelerated Convergence in Permutation-Invariant Neural Networks for Reinforcement Learning
Junaid Muzaffar
Ahsan Adeel
K. Ahmed
Ingo Frommholz
Zeeshan Pervez
26
0
0
31 May 2025
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control
Zijie Xu
Tong Bu
Zecheng Hao
Jianhao Ding
Zhaofei Yu
30
0
0
30 May 2025
A New Representation of Binary Sequences by means of Boolean Functions
S.D. Cardell
A. Fúster-Sabater
V. Requena
M. Beltrá
22
0
0
30 May 2025
Enhanced DACER Algorithm with High Diffusion Efficiency
Yinuo Wang
Mining Tan
Wenjun Zou
Haotian Lin
Xujie Song
...
Guojian Zhan
Tianze Zhu
Shiqi Liu
Jingliang Duan
Shengbo Eben Li
DiffM
73
0
0
29 May 2025
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation
Hossein Goli
Michael Gimelfarb
Nathan Samuel de Lara
Haruki Nishimura
Masha Itkina
Florian Shkurti
OffRL
40
0
0
27 May 2025
A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment
Brett Bissey
Kyle Gatesman
Walker Dimon
Mohammad Alam
Luis Robaina
Joseph Weissman
AAML
41
0
0
27 May 2025
Deep Actor-Critics with Tight Risk Certificates
Bahareh Tasdighi
Manuel Haussmann
Yi-Shan Wu
A. Masegosa
M. Kandemir
UQCV
88
0
0
26 May 2025
Surrogate-Assisted Evolutionary Reinforcement Learning Based on Autoencoder and Hyperbolic Neural Network
Bingdong Li
Mei Jiang
Hong Qian
K. Tang
W. Hong
Peng Yang
127
0
0
26 May 2025
Improving Value Estimation Critically Enhances Vanilla Policy Gradient
Tao Wang
Ruipeng Zhang
Sicun Gao
OffRL
53
0
0
25 May 2025
lmgame-Bench: How Good are LLMs at Playing Games?
Lanxiang Hu
Mingjia Huo
Yu Zhang
Haoyang Yu
Eric P. Xing
Ion Stoica
Tajana Rosing
Haojian Jin
Hao Zhang
136
1
0
21 May 2025
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation
Runze Zhao
Yue Yu
Adams Yiyue Zhu
Chen Yang
Dongruo Zhou
48
0
0
20 May 2025
Counterfactual Explanations for Continuous Action Reinforcement Learning
Shuyang Dong
Shangtong Zhang
Lu Feng
OffRL
LRM
91
0
0
19 May 2025
Temporal Distance-aware Transition Augmentation for Offline Model-based Reinforcement Learning
Dongsu Lee
Minhae Kwon
OffRL
93
0
0
19 May 2025
Learning Probabilistic Temporal Logic Specifications for Stochastic Systems
Rajarshi Roy
Yash Pote
David Parker
Marta Kwiatkowska
57
0
0
17 May 2025
Can Global XAI Methods Reveal Injected Bias in LLMs? SHAP vs Rule Extraction vs RuleSHAP
Francesco Sovrano
150
2
0
16 May 2025
ReaCritic: Large Reasoning Transformer-based DRL Critic-model Scaling For Heterogeneous Networks
Feiran You
Hongyang Du
OffRL
LRM
96
0
0
16 May 2025
Visual Planning: Let's Think Only with Images
Yi Xu
Chengzu Li
Han Zhou
Xingchen Wan
Caiqi Zhang
Anna Korhonen
Ivan Vulić
LM&Ro
LRM
163
1
0
16 May 2025
Fine-tuning Diffusion Policies with Backpropagation Through Diffusion Timesteps
Ningyuan Yang
Jiaxuan Gao
Feng Gao
Yi Wu
Chao Yu
152
0
0
15 May 2025
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts
Jing-Cheng Pang
Kaiyuan Li
Yansen Wang
Si-Hang Yang
Shengyi Jiang
Yang Yu
OffRL
LLMAG
LM&Ro
LRM
65
0
0
15 May 2025
Diffusion-SAFE: Shared Autonomy Framework with Diffusion for Safe Human-to-Robot Driving Handover
Yunxin Fan
Monroe Kennedy III
49
0
0
15 May 2025
Monte Carlo Beam Search for Actor-Critic Reinforcement Learning in Continuous Control
Hazim Alzorgan
Abolfazl Razi
58
0
0
13 May 2025
High-order Regularization for Machine Learning and Learning-based Control
Xinghua Liu
Ming Cao
52
0
0
13 May 2025
Decentralized Distributed Proximal Policy Optimization (DD-PPO) for High Performance Computing Scheduling on Multi-User Systems
Matthew Sgambati
Aleksandar Vakanski
Matthew Anderson
45
0
0
06 May 2025
TutorGym: A Testbed for Evaluating AI Agents as Tutors and Students
Daniel Weitekamp
M. N. Siddiqui
Christopher MacLellan
LLMAG
ELM
75
0
0
02 May 2025
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng
Weihao Tan
Zhiyi Lyu
Longtao Zheng
Haiyang Xu
Ming Yan
Fei Huang
Jingyi Wang
60
0
0
01 May 2025
Return Capping: Sample-Efficient CVaR Policy Gradient Optimisation
Harry Mead
Clarissa Costen
Bruno Lacerda
Nick Hawes
129
0
0
29 Apr 2025
DeeP-Mod: Deep Dynamic Programming based Environment Modelling using Feature Extraction
Chris Child
Lam Ngo
56
0
0
29 Apr 2025
Fitness Landscape of Large Language Model-Assisted Automated Algorithm Search
Fei Liu
Qingfu Zhang
Xialiang Tong
Mingxuan Yuan
K. Mao
141
0
0
28 Apr 2025
HyperController: A Hyperparameter Controller for Fast and Stable Training of Reinforcement Learning Neural Networks
J. Gornet
Yiannis Kantaros
Bruno Sinopoli
391
0
0
27 Apr 2025
Recursive Deep Inverse Reinforcement Learning
Paul Ghanem
Owen Howell
Michael Potter
Pau Closas
A. Ramezani
Deniz Erdogmus
Tales Imbiriba
68
0
0
17 Apr 2025
pix2pockets: Shot Suggestions in 8-Ball Pool from a Single Image in the Wild
Jonas Myhre Schiøtt
Viktor Sebastian Petersen
Dimitrios P. Papadopoulos
VLM
129
0
0
16 Apr 2025
VIPO: Value Function Inconsistency Penalized Offline Reinforcement Learning
Xuyang Chen
Guojian Wang
Keyu Yan
Lin Zhao
OffRL
94
1
0
16 Apr 2025
Moderate Actor-Critic Methods: Controlling Overestimation Bias via Expectile Loss
Ukjo Hwang
Songnam Hong
OffRL
76
0
0
14 Apr 2025
TRATSS: Transformer-Based Task Scheduling System for Autonomous Vehicles
Yazan Youssef
Paulo Ricardo Marques de Araujo
Aboelmagd Noureldin
Sidney Givigi
49
0
0
07 Apr 2025
Sim4EndoR: A Reinforcement Learning Centered Simulation Platform for Task Automation of Endovascular Robotics
Tianliang Yao
Madaoji Ban
Bo Lu
Zhiqiang Pei
Peng Qi
89
2
0
04 Apr 2025
A Constrained Multi-Agent Reinforcement Learning Approach to Autonomous Traffic Signal Control
Anirudh Satheesh
Keenan Powell
84
0
0
30 Mar 2025
On the Mistaken Assumption of Interchangeable Deep Reinforcement Learning Implementations
Rajdeep Singh Hundal
Yan Xiao
Xiaochun Cao
Jin Song Dong
Manuel Rigger
134
0
0
28 Mar 2025
Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization
Zhenyu Liang
Hao Li
Naiwei Yu
Kebin Sun
Ran Cheng
146
1
0
26 Mar 2025
Mining-Gym: A Configurable RL Benchmarking Environment for Truck Dispatch Scheduling
C. Banerjee
Kien Nguyen
Clinton Fookes
OffRL
95
0
0
24 Mar 2025