arXiv:2106.06860
A Minimalist Approach to Offline Reinforcement Learning
12 June 2021
Scott Fujimoto
S. Gu
OffRL
Papers citing "A Minimalist Approach to Offline Reinforcement Learning"
50 / 522 papers shown
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Han Zheng
Xufang Luo
Pengfei Wei
Xuan Song
Dongsheng Li
Jing Jiang
OffRL
OnRL
15
21
0
14 Mar 2023
Synthetic Experience Replay
Cong Lu
Philip J. Ball
Yee Whye Teh
Jack Parker-Holder
OffRL
94
67
0
12 Mar 2023
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
Mitsuhiko Nakamoto
Yuexiang Zhai
Anika Singh
Max Sobol Mark
Yi Ma
Chelsea Finn
Aviral Kumar
Sergey Levine
OffRL
OnRL
112
108
0
09 Mar 2023
Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning
Pengqin Wang
Meixin Zhu
Shaojie Shen
OffRL
27
1
0
07 Mar 2023
Graph Decision Transformer
Shengchao Hu
Li Shen
Ya-Qin Zhang
Dacheng Tao
OffRL
30
15
0
07 Mar 2023
Decision Transformer under Random Frame Dropping
Kaizhe Hu
Rachel Zheng
Yang Gao
Huazhe Xu
OffRL
126
12
0
03 Mar 2023
The In-Sample Softmax for Offline Reinforcement Learning
Chenjun Xiao
Han Wang
Yangchen Pan
Adam White
Martha White
OffRL
24
26
0
28 Feb 2023
The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning
Haotian Hu
Yiqin Yang
Qianchuan Zhao
Chongjie Zhang
OffRL
11
5
0
27 Feb 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
21
0
0
25 Feb 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation
Thanh Nguyen-Tang
R. Arora
OffRL
46
5
0
24 Feb 2023
Neural Laplace Control for Continuous-time Delayed Systems
Samuel Holt
Alihan Huyuk
Zhaozhi Qian
Hao Sun
M. Schaar
OffRL
26
10
0
24 Feb 2023
Behavior Proximal Policy Optimization
Zifeng Zhuang
Kun Lei
Jinxin Liu
Donglin Wang
Yilang Guo
OffRL
27
34
0
22 Feb 2023
Adversarial Model for Offline Reinforcement Learning
M. Bhardwaj
Tengyang Xie
Byron Boots
Nan Jiang
Ching-An Cheng
AAML
OffRL
32
25
0
21 Feb 2023
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems
Yihao Feng
Shentao Yang
Shujian Zhang
Jianguo Zhang
Caiming Xiong
Mi Zhou
Haiquan Wang
OffRL
28
24
0
20 Feb 2023
Swapped goal-conditioned offline reinforcement learning
Wenyan Yang
Huiling Wang
Dingding Cai
Joni Pajarinen
Joni-Kristian Kämäräinen
OffRL
OnRL
30
1
0
17 Feb 2023
Dual RL: Unification and New Methods for Reinforcement and Imitation Learning
Harshit S. Sikchi
Qinqing Zheng
Amy Zhang
S. Niekum
OffRL
30
19
0
16 Feb 2023
Prioritized offline Goal-swapping Experience Replay
Wenyan Yang
Joni Pajarinen
Dingding Cai
Joni Kämäräinen
OffRL
OnRL
26
0
0
15 Feb 2023
Constrained Decision Transformer for Offline Safe Reinforcement Learning
Zuxin Liu
Zijian Guo
Yi-Fan Yao
Zhepeng Cen
Wenhao Yu
Tingnan Zhang
Ding Zhao
OffRL
31
46
0
14 Feb 2023
Conservative State Value Estimation for Offline Reinforcement Learning
Liting Chen
Jie Yan
Zhengdao Shao
Lu Wang
Qingwei Lin
Saravan Rajmohan
Thomas Moscibroda
Dongmei Zhang
OffRL
18
5
0
14 Feb 2023
A Strong Baseline for Batch Imitation Learning
Matthew Smith
Lucas Maystre
Zhenwen Dai
K. Ciosek
OffRL
17
4
0
06 Feb 2023
Reinforcing User Retention in a Billion Scale Short Video Recommender System
Qingpeng Cai
Shuchang Liu
Xueliang Wang
Tianyou Zuo
Wentao Xie
Bin Yang
Dong Zheng
Peng Jiang
Kun Gai
OffRL
22
41
0
03 Feb 2023
Mind the Gap: Offline Policy Optimization for Imperfect Rewards
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Qing-Shan Jia
Ya-Qin Zhang
OffRL
38
19
0
03 Feb 2023
Policy Expansion for Bridging Offline-to-Online Reinforcement Learning
Haichao Zhang
Weiwen Xu
Haonan Yu
CLL
OffRL
OnRL
40
62
0
02 Feb 2023
Off-the-Grid MARL: Datasets with Baselines for Offline Multi-Agent Reinforcement Learning
Claude Formanek
Asad Jeewa
Jonathan P. Shock
Arnu Pretorius
OffRL
40
1
0
01 Feb 2023
Anti-Exploration by Random Network Distillation
Alexander Nikulin
Vladislav Kurenkov
Denis Tarasov
Sergey Kolesnikov
38
24
0
31 Jan 2023
Identifying Expert Behavior in Offline Training Datasets Improves Behavioral Cloning of Robotic Manipulation Policies
Qiang-qiang Wang
Robert McCarthy
David Córdova Bulens
Francisco Roldan Sanchez
Kevin McGuinness
Noel E. O'Connor
S. Redmond
OffRL
25
3
0
30 Jan 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Hanlin Zhu
Paria Rashidinejad
Jiantao Jiao
OffRL
38
15
0
30 Jan 2023
Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning
Jing Zhang
Chi Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
29
7
0
28 Jan 2023
Improving Behavioural Cloning with Positive Unlabeled Learning
Qiang-qiang Wang
Robert McCarthy
David Córdova Bulens
Kevin McGuinness
Noel E. O'Connor
Nico Gürtler
Felix Widmaier
Francisco Roldan Sanchez
S. Redmond
OffRL
OnRL
24
8
0
27 Jan 2023
Plan To Predict: Learning an Uncertainty-Foreseeing Model for Model-Based Reinforcement Learning
Zifan Wu
Chao Yu
Cheng Chen
Jianye Hao
H. Zhuo
11
16
0
20 Jan 2023
Extreme Q-Learning: MaxEnt RL without Entropy
Divyansh Garg
Joey Hejna
M. Geist
Stefano Ermon
OffRL
33
63
0
05 Jan 2023
Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios
Yiren Lu
Justin Fu
George Tucker
Xinlei Pan
Eli Bronstein
...
Brandyn White
Aleksandra Faust
Shimon Whiteson
Drago Anguelov
Sergey Levine
OffRL
26
92
0
21 Dec 2022
Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies
Shivakanth Sujit
Pedro H. M. Braga
J. Bornschein
Samira Ebrahimi Kahou
OffRL
17
1
0
15 Dec 2022
Confidence-Conditioned Value Functions for Offline Reinforcement Learning
Joey Hong
Aviral Kumar
Sergey Levine
OffRL
33
20
0
08 Dec 2022
Model-based trajectory stitching for improved behavioural cloning and its applications
Charles A. Hepburn
Giovanni Montana
OffRL
16
5
0
08 Dec 2022
Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL
29
0
0
07 Dec 2022
PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement
Wanqi Xue
Qingpeng Cai
Zhenghai Xue
Shuo Sun
Shuchang Liu
Dong Zheng
Peng Jiang
Kun Gai
Bo An
OffRL
28
25
0
06 Dec 2022
TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets
Yuanying Cai
Chuheng Zhang
Li Zhao
Wei Shen
Xuyun Zhang
Lei Song
Jiang Bian
Tao Qin
Tie-Yan Liu
OffRL
17
3
0
05 Dec 2022
Flow to Control: Offline Reinforcement Learning with Lossless Primitive Discovery
Yiqin Yang
Haotian Hu
Wenzhe Li
Siyuan Li
Jun Yang
Qianchuan Zhao
Chongjie Zhang
OffRL
28
9
0
02 Dec 2022
One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning
Marc Rigter
Bruno Lacerda
Nick Hawes
OffRL
16
6
0
30 Nov 2022
Efficient Reinforcement Learning Through Trajectory Generation
Wenqi Cui
Linbin Huang
Weiwei Yang
Baosen Zhang
OffRL
23
0
0
30 Nov 2022
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Jiachen Li
Edwin Zhang
Ming Yin
Qinxun Bai
Yu-Xiang Wang
William Yang Wang
OffRL
31
15
0
29 Nov 2022
Learning from Good Trajectories in Offline Multi-Agent Reinforcement Learning
Qiangxing Tian
Kun Kuang
Furui Liu
Baoxiang Wang
OffRL
29
9
0
28 Nov 2022
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes
Aviral Kumar
Rishabh Agarwal
Xinyang Geng
George Tucker
Sergey Levine
OffRL
41
48
0
28 Nov 2022
State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning
Cheng Chen
Hongyao Tang
Yi Ma
Chao Wang
Qianli Shen
Dong Li
Jianye Hao
OffRL
26
0
0
28 Nov 2022
A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation
Hiroki Furuta
Yusuke Iwasawa
Yutaka Matsuo
S. Gu
20
14
0
25 Nov 2022
Improving TD3-BC: Relaxed Policy Constraint for Offline Learning and Stable Online Fine-Tuning
Alex Beeson
Giovanni Montana
OffRL
OnRL
18
22
0
21 Nov 2022
Model-based Trajectory Stitching for Improved Offline Reinforcement Learning
Charles A. Hepburn
Giovanni Montana
OffRL
29
13
0
21 Nov 2022
Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows
D. Akimov
Vladislav Kurenkov
Alexander Nikulin
Denis Tarasov
Sergey Kolesnikov
OffRL
16
9
0
20 Nov 2022
Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size
Alexander Nikulin
Vladislav Kurenkov
Denis Tarasov
Dmitry Akimov
Sergey Kolesnikov
OffRL
31
14
0
20 Nov 2022