Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1812.02900
Cited By
Off-Policy Deep Reinforcement Learning without Exploration
7 December 2018
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Off-Policy Deep Reinforcement Learning without Exploration"
50 / 415 papers shown
Title
The Pump Scheduling Problem: A Real-World Scenario for Reinforcement Learning
Henrique Donancio
L. Vercouter
H. Roclawski
AI4CE
25
1
0
20 Oct 2022
Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation
Chengqian Gao
Kelvin Xu
Liu Liu
Deheng Ye
P. Zhao
Zhiqiang Xu
OffRL
45
2
0
19 Oct 2022
Boosting Offline Reinforcement Learning via Data Rebalancing
Yang Yue
Bingyi Kang
Xiao Ma
Zhongwen Xu
Gao Huang
Shuicheng Yan
OffRL
31
22
0
17 Oct 2022
CUP: Critic-Guided Policy Reuse
Jin Zhang
Siyuan Li
Chongjie Zhang
42
8
0
15 Oct 2022
Learning Skills from Demonstrations: A Trend from Motion Primitives to Experience Abstraction
Mehrdad Tavassoli
S. Katyara
Maria Pozzi
Nikhil Deshpande
D. Caldwell
D. Prattichizzo
64
11
0
14 Oct 2022
Sustainable Online Reinforcement Learning for Auto-bidding
Zhiyu Mou
Yusen Huo
Rongquan Bai
Mingzhou Xie
Chuan Yu
Jian Xu
Bo Zheng
OffRL
OnRL
39
17
0
13 Oct 2022
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
Qinqing Zheng
Mikael Henaff
Brandon Amos
Aditya Grover
OffRL
31
20
0
12 Oct 2022
Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems
Zhengbang Zhu
Rongjun Qin
Junjie Huang
Xinyi Dai
Yang Yu
Yong Yu
Weinan Zhang
51
2
0
11 Oct 2022
Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials
Aviral Kumar
Anika Singh
F. Ebert
Mitsuhiko Nakamoto
Yanlai Yang
Chelsea Finn
Sergey Levine
OffRL
OnRL
136
67
0
11 Oct 2022
Reliable Conditioning of Behavioral Cloning for Offline Reinforcement Learning
Tung Nguyen
Qinqing Zheng
Aditya Grover
OffRL
63
6
0
11 Oct 2022
BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets
Chen Gong
Zhou Yang
Yunru Bai
Junda He
Jieke Shi
...
Arunesh Sinha
Bowen Xu
Xinwen Hou
David Lo
Guoliang Fan
AAML
OffRL
31
8
0
07 Oct 2022
B2RL: An open-source Dataset for Building Batch Reinforcement Learning
Hsin-Yu Liu
Xiaohan Fu
Bharathan Balaji
Rajesh E. Gupta
Dezhi Hong
OffRL
32
4
0
30 Sep 2022
S2P: State-conditioned Image Synthesis for Data Augmentation in Offline Reinforcement Learning
Daesol Cho
D. Shim
H. J. Kim
OffRL
62
11
0
30 Sep 2022
Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling
Huayu Chen
Cheng Lu
Chengyang Ying
Hang Su
Jun Zhu
DiffM
OffRL
118
108
0
29 Sep 2022
Latent Plans for Task-Agnostic Offline Reinforcement Learning
Erick Rosete-Beas
Oier Mees
Gabriel Kalweit
Joschka Boedecker
Wolfram Burgard
OffRL
51
81
0
19 Sep 2022
Age of Semantics in Cooperative Communications: To Expedite Simulation Towards Real via Offline Reinforcement Learning
Xianfu Chen
Zhifeng Zhao
S. Mao
Celimuge Wu
Honggang Zhang
M. Bennis
OffRL
37
3
0
19 Sep 2022
On the Reuse Bias in Off-Policy Reinforcement Learning
Chengyang Ying
Zhongkai Hao
Xinning Zhou
Hang Su
Dong Yan
Jun Zhu
OffRL
45
3
0
15 Sep 2022
Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation
Xiaoteng Ma
Zhipeng Liang
Jose H. Blanchet
MingWen Liu
Li Xia
Jiheng Zhang
Qianchuan Zhao
Zhengyuan Zhou
OOD
OffRL
46
23
0
14 Sep 2022
Statistical Estimation of Confounded Linear MDPs: An Instrumental Variable Approach
Miao Lu
Wenhao Yang
Liangyu Zhang
Zhihua Zhang
OffRL
45
1
0
12 Sep 2022
Task-Agnostic Learning to Accomplish New Tasks
Xianqi Zhang
Xingtao Wang
Xu Liu
Wenrui Wang
Xiaopeng Fan
Debin Zhao
OffRL
94
0
0
09 Sep 2022
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL
Taku Yamagata
Ahmed Khalil
Raúl Santos-Rodríguez
OffRL
164
73
0
08 Sep 2022
Dialogue Evaluation with Offline Reinforcement Learning
Nurul Lubis
Christian Geishauser
Hsien-Chin Lin
Carel van Niekerk
Michael Heck
Shutong Feng
Milica Gavsić
OffRL
32
4
0
02 Sep 2022
Goal-Conditioned Q-Learning as Knowledge Distillation
Alexander Levine
Soheil Feizi
OffRL
45
2
0
28 Aug 2022
SupervisorBot: NLP-Annotated Real-Time Recommendations of Psychotherapy Treatment Strategies with Deep Reinforcement Learning
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
OffRL
34
12
0
27 Aug 2022
Unified Policy Optimization for Continuous-action Reinforcement Learning in Non-stationary Tasks and Games
Rongjun Qin
Fan Luo
Hong Qian
Yang Yu
37
2
0
19 Aug 2022
Multi-Task Fusion via Reinforcement Learning for Long-Term User Satisfaction in Recommender Systems
Qihua Zhang
Junning Liu
Yuzhuo Dai
Yiyan Qi
Yifan Yuan
Kunlun Zheng
Fan Huang
Xianfeng Tan
OffRL
40
50
0
09 Aug 2022
Robot Policy Learning from Demonstration Using Advantage Weighting and Early Termination
A. Mohtasib
Gerhard Neumann
Heriberto Cuayáhuitl
OffRL
49
2
0
31 Jul 2022
Offline Reinforcement Learning at Multiple Frequencies
Kaylee Burns
Tianhe Yu
Chelsea Finn
Karol Hausman
OffRL
50
6
0
26 Jul 2022
Reinforcement Learning For Survival, A Clinically Motivated Method For Critically Ill Patients
Thesath Nanayakkara
OOD
OffRL
29
0
0
17 Jul 2022
BCRLSP: An Offline Reinforcement Learning Framework for Sequential Targeted Promotion
Fanglin Chen
Xiao Liu
Bo Tang
Feiyu Xiong
Serim Hwang
Guomian Zhuang
OffRL
25
1
0
16 Jul 2022
Making Linear MDPs Practical via Contrastive Representation Learning
Tianjun Zhang
Tongzheng Ren
Mengjiao Yang
Joseph E. Gonzalez
Dale Schuurmans
Bo Dai
30
44
0
14 Jul 2022
Hindsight Learning for MDPs with Exogenous Inputs
Sean R. Sinclair
Felipe Vieira Frujeri
Ching-An Cheng
Luke Marshall
Hugo Barbalho
Jingling Li
Jennifer Neville
Ishai Menache
Adith Swaminathan
21
23
0
13 Jul 2022
Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning
Homer Walke
Jonathan Yang
Albert Yu
Aviral Kumar
Jedrzej Orbik
Avi Singh
Sergey Levine
OffRL
OnRL
40
32
0
11 Jul 2022
Multi-objective Optimization of Notifications Using Offline Reinforcement Learning
Prakruthi Prabhakar
Yiping Yuan
Guangyu Yang
Wensheng Sun
A. Muralidharan
OffRL
33
6
0
07 Jul 2022
Offline RL Policies Should be Trained to be Adaptive
Dibya Ghosh
Anurag Ajay
Pulkit Agrawal
Sergey Levine
OffRL
42
46
0
05 Jul 2022
Offline Policy Optimization with Eligible Actions
Yao Liu
Yannis Flet-Berliac
Emma Brunskill
OffRL
38
5
0
01 Jul 2022
Modular Lifelong Reinforcement Learning via Neural Composition
Jorge Armando Mendez Mendez
H. V. Seijen
Eric Eaton
OffRL
KELM
CLL
88
38
0
01 Jul 2022
Watch and Match: Supercharging Imitation with Regularized Optimal Transport
Siddhant Haldar
Vaibhav Mathur
Denis Yarats
Lerrel Pinto
63
62
0
30 Jun 2022
A Survey on Model-based Reinforcement Learning
Fan Luo
Tian Xu
Hang Lai
Xiong-Hui Chen
Weinan Zhang
Yang Yu
OffRL
LRM
69
101
0
19 Jun 2022
SMPL: Simulated Industrial Manufacturing and Process Control Learning Environments
Mohan Zhang
Xiaozhou Wang
Benjamin Decardi-Nelson
Bo Song
A. Zhang
...
Jiayi Cheng
Xiaohong Liu
DengDeng Yu
Matthew Poon
Animesh Garg
39
4
0
17 Jun 2022
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Yuanpei Chen
Tianhao Wu
Shengjie Wang
Xidong Feng
Jiechuan Jiang
...
Yiran Geng
Hao Dong
Zongqing Lu
Song-Chun Zhu
Yaodong Yang
OffRL
53
111
0
17 Jun 2022
Bootstrapped Transformer for Offline Reinforcement Learning
Kerong Wang
Hanye Zhao
Xufang Luo
Kan Ren
Weinan Zhang
Dongsheng Li
OffRL
21
38
0
17 Jun 2022
Relative Policy-Transition Optimization for Fast Policy Transfer
Jiawei Xu
Cheng Zhou
Yizheng Zhang
Zhengyou Zhang
Lei Han
30
0
0
13 Jun 2022
Federated Offline Reinforcement Learning
D. Zhou
Yufeng Zhang
Aaron Sonabend-W
Zhaoran Wang
Junwei Lu
Tianxi Cai
OffRL
40
13
0
11 Jun 2022
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations
Cong Lu
Philip J. Ball
Tim G. J. Rudner
Jack Parker-Holder
Michael A. Osborne
Yee Whye Teh
OffRL
47
52
0
09 Jun 2022
On the Role of Discount Factor in Offline Reinforcement Learning
Haotian Hu
Yiqin Yang
Qianchuan Zhao
Chongjie Zhang
OffRL
47
18
0
07 Jun 2022
Incorporating Explicit Uncertainty Estimates into Deep Offline Reinforcement Learning
David Brandfonbrener
Rémi Tachet des Combes
Romain Laroche
OffRL
46
5
0
02 Jun 2022
ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor
Wanqi Xue
Qingpeng Cai
Ruohan Zhan
Dong Zheng
Peng Jiang
Kun Gai
Bo An
OffRL
43
24
0
01 Jun 2022
Non-Markovian policies occupancy measures
Romain Laroche
Rémi Tachet des Combes
Jacob Buckman
OffRL
44
1
0
27 May 2022
Towards Learning Universal Hyperparameter Optimizers with Transformers
Yutian Chen
Xingyou Song
Chansoo Lee
Zehao Wang
Qiuyi Zhang
...
Greg Kochanski
Arnaud Doucet
MarcÁurelio Ranzato
Sagi Perel
Nando de Freitas
43
63
0
26 May 2022
Previous
1
2
3
4
5
6
7
8
9
Next