ResearchTrend.AI
OpenAI Gym

5 June 2016
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL, ODL
arXiv:1606.01540

Papers citing "OpenAI Gym"

50 / 2,578 papers shown
Title
Robust Dynamic Material Handling via Adaptive Constrained Evolutionary Reinforcement Learning
Chengpeng Hu
Ziming Wang
Bo Yuan
Jialin Liu
Chengqi Zhang
Xin Yao
20
0
0
20 Jun 2025
Off-Policy Actor-Critic for Adversarial Observation Robustness: Virtual Alternative Training via Symmetric Policy Evaluation
Kosuke Nakanishi
Akihiro Kubo
Yuji Yasui
Shin Ishii
AAML, OffRL
17
0
0
20 Jun 2025
Overcoming Overfitting in Reinforcement Learning via Gaussian Process Diffusion Policy
Amornyos Horprasert
Esa Apriaskar
Xingyu Liu
Lanlan Su
Lyudmila S. Mihaylova
22
0
0
16 Jun 2025
TooBadRL: Trigger Optimization to Boost Effectiveness of Backdoor Attacks on Deep Reinforcement Learning
Songze Li
Mingxuan Zhang
Kang Wei
Shouling Ji
AAML
90
0
0
11 Jun 2025
Wasserstein Barycenter Soft Actor-Critic
Zahra Shahrooei
Ali Baheri
OffRL
57
0
0
11 Jun 2025
When Maximum Entropy Misleads Policy Optimization
Ruipeng Zhang
Ya-Chien Chang
Sicun Gao
34
0
0
05 Jun 2025
Unsupervised Meta-Testing with Conditional Neural Processes for Hybrid Meta-Reinforcement Learning
S. E. Ada
Emre Ugur
BDL
56
1
0
04 Jun 2025
Bridging the Performance Gap Between Target-Free and Target-Based Reinforcement Learning With Iterated Q-Learning
Théo Vincent
Yogesh Tripathi
Tim Lukas Faust
Yaniv Oren
Jan Peters
Carlo D'Eramo
CLL
34
0
0
04 Jun 2025
The Actor-Critic Update Order Matters for PPO in Federated Reinforcement Learning
Zhijie Xie
Shenghui Song
53
0
0
02 Jun 2025
RLAE: Reinforcement Learning-Assisted Ensemble for LLMs
Y. Fu
Yuanheng Zhu
Jiajun Chai
Guojun Yin
Wei Lin
Qichao Zhang
Dongbin Zhao
23
0
0
31 May 2025
Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn
Hongyao Tang
J. Obando-Ceron
Pablo Samuel Castro
Aaron Courville
Glen Berseth
38
0
0
31 May 2025
Optimizing Sensory Neurons: Nonlinear Attention Mechanisms for Accelerated Convergence in Permutation-Invariant Neural Networks for Reinforcement Learning
Junaid Muzaffar
Ahsan Adeel
K. Ahmed
Ingo Frommholz
Zeeshan Pervez
26
0
0
31 May 2025
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control
Zijie Xu
Tong Bu
Zecheng Hao
Jianhao Ding
Zhaofei Yu
30
0
0
30 May 2025
A New Representation of Binary Sequences by means of Boolean Functions
S.D. Cardell
A. Fúster-Sabater
V. Requena
M. Beltrá
22
0
0
30 May 2025
Enhanced DACER Algorithm with High Diffusion Efficiency
Yinuo Wang
Mining Tan
Wenjun Zou
Haotian Lin
Xujie Song
...
Guojian Zhan
Tianze Zhu
Shiqi Liu
Jingliang Duan
Shengbo Eben Li
DiffM
73
0
0
29 May 2025
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation
Hossein Goli
Michael Gimelfarb
Nathan Samuel de Lara
Haruki Nishimura
Masha Itkina
Florian Shkurti
OffRL
40
0
0
27 May 2025
A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment
Brett Bissey
Kyle Gatesman
Walker Dimon
Mohammad Alam
Luis Robaina
Joseph Weissman
AAML
41
0
0
27 May 2025
Deep Actor-Critics with Tight Risk Certificates
Bahareh Tasdighi
Manuel Haussmann
Yi-Shan Wu
A. Masegosa
M. Kandemir
UQCV
88
0
0
26 May 2025
Surrogate-Assisted Evolutionary Reinforcement Learning Based on Autoencoder and Hyperbolic Neural Network
Bingdong Li
Mei Jiang
Hong Qian
K. Tang
W. Hong
Peng Yang
127
0
0
26 May 2025
Improving Value Estimation Critically Enhances Vanilla Policy Gradient
Tao Wang
Ruipeng Zhang
Sicun Gao
OffRL
53
0
0
25 May 2025
lmgame-Bench: How Good are LLMs at Playing Games?
Lanxiang Hu
Mingjia Huo
Yu Zhang
Haoyang Yu
Eric P. Xing
Ion Stoica
Tajana Rosing
Haojian Jin
Hao Zhang
136
1
0
21 May 2025
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation
Runze Zhao
Yue Yu
Adams Yiyue Zhu
Chen Yang
Dongruo Zhou
48
0
0
20 May 2025
Counterfactual Explanations for Continuous Action Reinforcement Learning
Shuyang Dong
Shangtong Zhang
Lu Feng
OffRL, LRM
91
0
0
19 May 2025
Temporal Distance-aware Transition Augmentation for Offline Model-based Reinforcement Learning
Dongsu Lee
Minhae Kwon
OffRL
93
0
0
19 May 2025
Learning Probabilistic Temporal Logic Specifications for Stochastic Systems
Rajarshi Roy
Yash Pote
David Parker
Marta Kwiatkowska
57
0
0
17 May 2025
Can Global XAI Methods Reveal Injected Bias in LLMs? SHAP vs Rule Extraction vs RuleSHAP
Francesco Sovrano
150
2
0
16 May 2025
ReaCritic: Large Reasoning Transformer-based DRL Critic-model Scaling For Heterogeneous Networks
Feiran You
Hongyang Du
OffRL, LRM
96
0
0
16 May 2025
Visual Planning: Let's Think Only with Images
Yi Xu
Chengzu Li
Han Zhou
Xingchen Wan
Caiqi Zhang
Anna Korhonen
Ivan Vulić
LM&Ro, LRM
163
1
0
16 May 2025
Fine-tuning Diffusion Policies with Backpropagation Through Diffusion Timesteps
Ningyuan Yang
Jiaxuan Gao
Feng Gao
Yi Wu
Chao Yu
152
0
0
15 May 2025
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts
Jing-Cheng Pang
Kaiyuan Li
Yansen Wang
Si-Hang Yang
Shengyi Jiang
Yang Yu
OffRL, LLMAG, LM&Ro, LRM
65
0
0
15 May 2025
Diffusion-SAFE: Shared Autonomy Framework with Diffusion for Safe Human-to-Robot Driving Handover
Yunxin Fan
Monroe Kennedy III
49
0
0
15 May 2025
Monte Carlo Beam Search for Actor-Critic Reinforcement Learning in Continuous Control
Hazim Alzorgan
Abolfazl Razi
58
0
0
13 May 2025
High-order Regularization for Machine Learning and Learning-based Control
Xinghua Liu
Ming Cao
52
0
0
13 May 2025
Decentralized Distributed Proximal Policy Optimization (DD-PPO) for High Performance Computing Scheduling on Multi-User Systems
Matthew Sgambati
Aleksandar Vakanski
Matthew Anderson
45
0
0
06 May 2025
TutorGym: A Testbed for Evaluating AI Agents as Tutors and Students
Daniel Weitekamp
M. N. Siddiqui
Christopher MacLellan
LLMAG, ELM
75
0
0
02 May 2025
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng
Weihao Tan
Zhiyi Lyu
Longtao Zheng
Haiyang Xu
Ming Yan
Fei Huang
Jingyi Wang
60
0
0
01 May 2025
Return Capping: Sample-Efficient CVaR Policy Gradient Optimisation
Harry Mead
Clarissa Costen
Bruno Lacerda
Nick Hawes
129
0
0
29 Apr 2025
DeeP-Mod: Deep Dynamic Programming based Environment Modelling using Feature Extraction
Chris Child
Lam Ngo
56
0
0
29 Apr 2025
Fitness Landscape of Large Language Model-Assisted Automated Algorithm Search
Fei Liu
Qingfu Zhang
Xialiang Tong
Mingxuan Yuan
K. Mao
141
0
0
28 Apr 2025
HyperController: A Hyperparameter Controller for Fast and Stable Training of Reinforcement Learning Neural Networks
J. Gornet
Yiannis Kantaros
Bruno Sinopoli
391
0
0
27 Apr 2025
Recursive Deep Inverse Reinforcement Learning
Paul Ghanem
Owen Howell
Michael Potter
Pau Closas
A. Ramezani
Deniz Erdogmus
Tales Imbiriba
68
0
0
17 Apr 2025
pix2pockets: Shot Suggestions in 8-Ball Pool from a Single Image in the Wild
Jonas Myhre Schiøtt
Viktor Sebastian Petersen
Dimitrios P. Papadopoulos
VLM
129
0
0
16 Apr 2025
VIPO: Value Function Inconsistency Penalized Offline Reinforcement Learning
Xuyang Chen
Guojian Wang
Keyu Yan
Lin Zhao
OffRL
94
1
0
16 Apr 2025
Moderate Actor-Critic Methods: Controlling Overestimation Bias via Expectile Loss
Ukjo Hwang
Songnam Hong
OffRL
76
0
0
14 Apr 2025
TRATSS: Transformer-Based Task Scheduling System for Autonomous Vehicles
Yazan Youssef
Paulo Ricardo Marques de Araujo
Aboelmagd Noureldin
Sidney Givigi
49
0
0
07 Apr 2025
Sim4EndoR: A Reinforcement Learning Centered Simulation Platform for Task Automation of Endovascular Robotics
Tianliang Yao
Madaoji Ban
Bo Lu
Zhiqiang Pei
Peng Qi
89
2
0
04 Apr 2025
A Constrained Multi-Agent Reinforcement Learning Approach to Autonomous Traffic Signal Control
Anirudh Satheesh
Keenan Powell
84
0
0
30 Mar 2025
On the Mistaken Assumption of Interchangeable Deep Reinforcement Learning Implementations
Rajdeep Singh Hundal
Yan Xiao
Xiaochun Cao
Jin Song Dong
Manuel Rigger
134
0
0
28 Mar 2025
Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization
Zhenyu Liang
Hao Li
Naiwei Yu
Kebin Sun
Ran Cheng
146
1
0
26 Mar 2025
Mining-Gym: A Configurable RL Benchmarking Environment for Truck Dispatch Scheduling
C. Banerjee
Kien Nguyen
Clinton Fookes
OffRL
95
0
0
24 Mar 2025