1812.02900
Cited By
Off-Policy Deep Reinforcement Learning without Exploration
7 December 2018
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
Papers citing
"Off-Policy Deep Reinforcement Learning without Exploration"
50 / 412 papers shown
Title
FlowQ: Energy-Guided Flow Policies for Offline Reinforcement Learning
Marvin Alles
Nutan Chen
Patrick van der Smagt
Botond Cseke
39
0
0
20 May 2025
Imagination-Limited Q-Learning for Offline Reinforcement Learning
Wenhui Liu
Zhijian Wu
Jingchao Wang
Dingjiang Huang
Shuigeng Zhou
OffRL
55
0
0
18 May 2025
Q-Policy: Quantum-Enhanced Policy Evaluation for Scalable Reinforcement Learning
Kalyan Cherukuri
Aarav Lala
Yash Yardi
27
0
0
17 May 2025
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
OffRL
OnRL
54
0
0
16 May 2025
Offline Reinforcement Learning for Microgrid Voltage Regulation
Shan Yang
Yongli Zhu
OffRL
48
0
0
15 May 2025
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts
Jing-Cheng Pang
Kaiyuan Li
Yansen Wang
Si-Hang Yang
Shengyi Jiang
Yang Yu
OffRL
LLMAG
LM&Ro
LRM
26
0
0
15 May 2025
Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning
Xinyue Wang
Zhen Zhang
OffRL
CML
39
0
0
13 May 2025
DARLR: Dual-Agent Offline Reinforcement Learning for Recommender Systems with Dynamic Reward
Yi Zhang
Ruihong Qiu
Xuwei Xu
Jiajun Liu
Sen Wang
OffRL
43
0
0
12 May 2025
Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach
Minting Pan
Yitao Zheng
Jiajian Li
Yunbo Wang
Xiaokang Yang
OffRL
75
0
0
10 May 2025
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
Jifeng Hu
Sili Huang
Zhiyong Yang
Shengchao Hu
Li Shen
Hechang Chen
Lichao Sun
Yi-Ju Chang
Dacheng Tao
OffRL
292
0
0
03 May 2025
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
Masanori Nojima
OffRL
44
0
0
02 May 2025
Learning Neural Control Barrier Functions from Offline Data with Conservatism
Ihab Tabbara
Hussein Sibai
OffRL
71
0
0
01 May 2025
Offline Learning of Controllable Diverse Behaviors
Mathieu Petitbois
Rémy Portelas
Sylvain Lamprier
Ludovic Denoyer
OffRL
41
0
0
25 Apr 2025
Generative Auto-Bidding with Value-Guided Explorations
Jingtong Gao
Yewen Li
Shuai Mao
Peng Jiang
Nan Jiang
...
Fei Pan
Peng Jiang
Kun Gai
Bo An
Xiangyu Zhao
OffRL
67
0
0
20 Apr 2025
Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning
Younghwan Lee
Tung M. Luu
Donghoon Lee
Chang D. Yoo
3DV
VLM
OffRL
56
0
0
03 Apr 2025
LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning
Chan Kim
Seung-Woo Seo
Seong-Woo Kim
OODD
318
0
0
21 Mar 2025
Mitigating Preference Hacking in Policy Optimization with Pessimism
Dhawal Gupta
Adam Fisch
Christoph Dann
Alekh Agarwal
78
0
0
10 Mar 2025
Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning
Zhenghai Xue
Lang Feng
Jiacheng Xu
Kang Kang
Xiang Wen
Jingyi Wang
Shuicheng Yan
OffRL
58
0
0
10 Mar 2025
DPR: Diffusion Preference-based Reward for Offline Reinforcement Learning
Teng Pang
Bingzheng Wang
Guoqiang Wu
Yilong Yin
OffRL
86
0
0
03 Mar 2025
Yes, Q-learning Helps Offline In-Context RL
Denis Tarasov
Alexander Nikulin
Ilya Zisman
Albina Klepach
Andrei Polubarov
Nikita Lyubaykin
Alexander Derevyagin
Igor Kiselev
Vladislav Kurenkov
OffRL
OnRL
287
1
0
24 Feb 2025
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices
Lixing Lyu
Jiashuo Jiang
Wang Chi Cheung
49
1
0
24 Feb 2025
Data Center Cooling System Optimization Using Offline Reinforcement Learning
Xianyuan Zhan
Xiangyu Zhu
Peng Cheng
Xiao Hu
Ziteng He
...
Chenhui Liu
Tianshun Hong
Huiwen Zheng
Yunxin Liu
Feng Zhao
AI4CE
84
0
0
17 Feb 2025
Zero-shot Model-based Reinforcement Learning using Large Language Models
Abdelhakim Benechehab
Youssef Attia El Hili
Ambroise Odonnat
Oussama Zekri
Albert Thomas
Giuseppe Paolo
Maurizio Filippone
I. Redko
Balázs Kégl
OffRL
79
1
0
17 Feb 2025
Learning Strategy Representation for Imitation Learning in Multi-Agent Games
Shiqi Lei
Kanghon Lee
Linjing Li
Jinkyoo Park
OffRL
54
0
0
17 Feb 2025
Model-Based Offline Reinforcement Learning with Reliability-Guaranteed Sequence Modeling
Shenghong He
OffRL
327
0
0
10 Feb 2025
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
134
5
0
06 Feb 2025
Dual Alignment Maximin Optimization for Offline Model-based RL
Chi Zhou
Wang Luo
Haoran Li
Congying Han
Tiande Guo
Zicheng Zhang
OffRL
88
0
0
02 Feb 2025
B3C: A Minimalist Approach to Offline Multi-Agent Reinforcement Learning
Woojun Kim
Katia Sycara
OffRL
96
0
0
30 Jan 2025
Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning
Zijian Guo
Weichao Zhou
Wenchao Li
OffRL
107
2
0
28 Jan 2025
TEA: Trajectory Encoding Augmentation for Robust and Transferable Policies in Offline Reinforcement Learning
Batıkan Bora Ormancı
Phillip Swazinna
Steffen Udluft
Thomas Runkler
OffRL
96
0
0
28 Jan 2025
Coordinating Ride-Pooling with Public Transit using Reward-Guided Conservative Q-Learning: An Offline Training and Online Fine-Tuning Reinforcement Learning Framework
Yulong Hu
Tingting Dong
Sen Li
OffRL
OnRL
76
0
0
24 Jan 2025
An Offline Multi-Agent Reinforcement Learning Framework for Radio Resource Management
Eslam Eldeeb
Hirley Alves
OffRL
92
0
0
22 Jan 2025
Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning
Abdullah Akgul
Manuel Haußmann
M. Kandemir
OffRL
89
0
0
17 Jan 2025
Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba
Wall Kim
Mamba
73
0
0
10 Jan 2025
SR-Reward: Taking The Path More Traveled
Seyed Mahdi Basiri Azad
Zahra Padar
Gabriel Kalweit
Joschka Boedecker
OffRL
82
0
0
04 Jan 2025
OMG-RL: Offline Model-based Guided Reward Learning for Heparin Treatment
Yooseok Lim
Sujee Lee
OffRL
171
0
0
03 Jan 2025
MADiff: Offline Multi-agent Learning with Diffusion Models
Zhengbang Zhu
Minghuan Liu
Liyuan Mao
Bingyi Kang
Minkai Xu
Yong Yu
Stefano Ermon
Weinan Zhang
DiffM
OffRL
93
35
0
03 Jan 2025
Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy
Keru Chen
Honghao Wei
Zhigang Deng
Sen Lin
OffRL
OnRL
110
0
0
31 Dec 2024
Enhancing Code LLMs with Reinforcement Learning in Code Generation: A Survey
Junqiao Wang
Zeng Zhang
Yangfan He
Yuyang Song
Tianyu Shi
...
Menghao Huo
Guangwu Qian
Keqin Li
Qiuwu Chen
Lewei He
67
11
0
29 Dec 2024
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning
Kun Wu
Yinuo Zhao
Zhihao Xu
Zhengping Che
Chengxiang Yin
C. Liu
Qinru Qiu
Feifei Feng
OffRL
114
1
0
22 Dec 2024
Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning
Emile Anand
Ishani Karmarkar
Guannan Qu
90
1
0
01 Dec 2024
Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning
Marvin Alles
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
OffRL
56
1
0
07 Nov 2024
FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding Optimization system
Zeyuan Li
Yangfan He
Lewei He
Jianhui Wang
Tianyu Shi
Bin Lei
Tianyu Shi
Qiuwu Chen
ALM
96
5
0
28 Oct 2024
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
Jing Zhang
Linjiajie Fang
Kexin Shi
Wenjia Wang
Bing-Yi Jing
OffRL
60
0
0
27 Oct 2024
Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces
Jifeng Hu
Sili Huang
Li Shen
Zhejian Yang
Shengchao Hu
Shisong Tang
Hechang Chen
Yi Chang
Dacheng Tao
Lichao Sun
OffRL
57
0
0
21 Oct 2024
Offline-to-online Reinforcement Learning for Image-based Grasping with Scarce Demonstrations
Bryan Chan
Anson Leung
James Bergstra
OffRL
OnRL
67
0
0
19 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
83
15
0
17 Oct 2024
Dynamic Learning Rate for Deep Reinforcement Learning: A Bandit Approach
Henrique Donâncio
Antoine Barrier
Leah F. South
Florence Forbes
33
0
0
16 Oct 2024
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
Zhi Wang
Li Zhang
Wenhao Wu
Yuanheng Zhu
Dongbin Zhao
C. L. Philip Chen
OffRL
59
6
0
15 Oct 2024
DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation
Jaehyun Park
Yunho Kim
Sejin Kim
Byung-Jun Lee
Sundong Kim
OffRL
52
1
0
15 Oct 2024