ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2106.06860
  4. Cited By
A Minimalist Approach to Offline Reinforcement Learning

A Minimalist Approach to Offline Reinforcement Learning

12 June 2021
Scott Fujimoto
S. Gu
    OffRL
ArXivPDFHTML

Papers citing "A Minimalist Approach to Offline Reinforcement Learning"

50 / 522 papers shown
Title
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Han Zheng
Xufang Luo
Pengfei Wei
Xuan Song
Dongsheng Li
Jing Jiang
OffRL
OnRL
15
21
0
14 Mar 2023
Synthetic Experience Replay
Synthetic Experience Replay
Cong Lu
Philip J. Ball
Yee Whye Teh
Jack Parker-Holder
OffRL
94
67
0
12 Mar 2023
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online
  Fine-Tuning
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
Mitsuhiko Nakamoto
Yuexiang Zhai
Anika Singh
Max Sobol Mark
Yi Ma
Chelsea Finn
Aviral Kumar
Sergey Levine
OffRL
OnRL
112
108
0
09 Mar 2023
Environment Transformer and Policy Optimization for Model-Based Offline
  Reinforcement Learning
Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning
Pengqin Wang
Meixin Zhu
Shaojie Shen
OffRL
27
1
0
07 Mar 2023
Graph Decision Transformer
Graph Decision Transformer
Shengchao Hu
Li Shen
Ya-Qin Zhang
Dacheng Tao
OffRL
30
15
0
07 Mar 2023
Decision Transformer under Random Frame Dropping
Decision Transformer under Random Frame Dropping
Kaizhe Hu
Rachel Zheng
Yang Gao
Huazhe Xu
OffRL
126
12
0
03 Mar 2023
The In-Sample Softmax for Offline Reinforcement Learning
The In-Sample Softmax for Offline Reinforcement Learning
Chenjun Xiao
Han Wang
Yangchen Pan
Adam White
Martha White
OffRL
24
26
0
28 Feb 2023
The Provable Benefits of Unsupervised Data Sharing for Offline
  Reinforcement Learning
The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning
Haotian Hu
Yiqin Yang
Qianchuan Zhao
Chongjie Zhang
OffRL
11
5
0
27 Feb 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function
  Approximation
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
21
0
0
25 Feb 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function
  Approximation
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation
Thanh Nguyen-Tang
R. Arora
OffRL
46
5
0
24 Feb 2023
Neural Laplace Control for Continuous-time Delayed Systems
Neural Laplace Control for Continuous-time Delayed Systems
Samuel Holt
Alihan Huyuk
Zhaozhi Qian
Hao Sun
M. Schaar
OffRL
26
10
0
24 Feb 2023
Behavior Proximal Policy Optimization
Behavior Proximal Policy Optimization
Zifeng Zhuang
Kun Lei
Jinxin Liu
Donglin Wang
Yilang Guo
OffRL
27
34
0
22 Feb 2023
Adversarial Model for Offline Reinforcement Learning
Adversarial Model for Offline Reinforcement Learning
M. Bhardwaj
Tengyang Xie
Byron Boots
Nan Jiang
Ching-An Cheng
AAML
OffRL
32
25
0
21 Feb 2023
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning
  for Task-oriented Dialogue Systems
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems
Yihao Feng
Shentao Yang
Shujian Zhang
Jianguo Zhang
Caiming Xiong
Mi Zhou
Haiquan Wang
OffRL
28
24
0
20 Feb 2023
Swapped goal-conditioned offline reinforcement learning
Swapped goal-conditioned offline reinforcement learning
Wenyan Yang
Huiling Wang
Dingding Cai
Joni Pajarinen
Joni-Kristen Kämäräinen
OffRL
OnRL
30
1
0
17 Feb 2023
Dual RL: Unification and New Methods for Reinforcement and Imitation
  Learning
Dual RL: Unification and New Methods for Reinforcement and Imitation Learning
Harshit S. Sikchi
Qinqing Zheng
Amy Zhang
S. Niekum
OffRL
30
19
0
16 Feb 2023
Prioritized offline Goal-swapping Experience Replay
Prioritized offline Goal-swapping Experience Replay
Wenyan Yang
Joni Pajarinen
Dinging Cai
Joni Kämäräinen
OffRL
OnRL
26
0
0
15 Feb 2023
Constrained Decision Transformer for Offline Safe Reinforcement Learning
Constrained Decision Transformer for Offline Safe Reinforcement Learning
Zuxin Liu
Zijian Guo
Yi-Fan Yao
Zhepeng Cen
Wenhao Yu
Tingnan Zhang
Ding Zhao
OffRL
31
46
0
14 Feb 2023
Conservative State Value Estimation for Offline Reinforcement Learning
Conservative State Value Estimation for Offline Reinforcement Learning
Liting Chen
Jie Yan
Zhengdao Shao
Lu Wang
Qingwei Lin
Saravan Rajmohan
Thomas Moscibroda
Dongmei Zhang
OffRL
18
5
0
14 Feb 2023
A Strong Baseline for Batch Imitation Learning
A Strong Baseline for Batch Imitation Learning
Matthew Smith
Lucas Maystre
Zhenwen Dai
K. Ciosek
OffRL
17
4
0
06 Feb 2023
Reinforcing User Retention in a Billion Scale Short Video Recommender
  System
Reinforcing User Retention in a Billion Scale Short Video Recommender System
Qingpeng Cai
Shuchang Liu
Xueliang Wang
Tianyou Zuo
Wentao Xie
Bin Yang
Dong Zheng
Peng Jiang
Kun Gai
OffRL
22
41
0
03 Feb 2023
Mind the Gap: Offline Policy Optimization for Imperfect Rewards
Mind the Gap: Offline Policy Optimization for Imperfect Rewards
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Qing-Shan Jia
Ya-Qin Zhang
OffRL
38
19
0
03 Feb 2023
Policy Expansion for Bridging Offline-to-Online Reinforcement Learning
Policy Expansion for Bridging Offline-to-Online Reinforcement Learning
Haichao Zhang
Weiwen Xu
Haonan Yu
CLL
OffRL
OnRL
40
62
0
02 Feb 2023
Off-the-Grid MARL: Datasets with Baselines for Offline Multi-Agent
  Reinforcement Learning
Off-the-Grid MARL: Datasets with Baselines for Offline Multi-Agent Reinforcement Learning
Claude Formanek
Asad Jeewa
Jonathan P. Shock
Arnu Pretorius
OffRL
40
1
0
01 Feb 2023
Anti-Exploration by Random Network Distillation
Anti-Exploration by Random Network Distillation
Alexander Nikulin
Vladislav Kurenkov
Denis Tarasov
Sergey Kolesnikov
38
24
0
31 Jan 2023
Identifying Expert Behavior in Offline Training Datasets Improves
  Behavioral Cloning of Robotic Manipulation Policies
Identifying Expert Behavior in Offline Training Datasets Improves Behavioral Cloning of Robotic Manipulation Policies
Qiang-qiang Wang
Robert McCarthy
David Córdova Bulens
Francisco Roldan Sanchez
Kevin McGuinness
Noel E. O'Connor
S. Redmond
OffRL
25
3
0
30 Jan 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline
  Reinforcement Learning
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Hanlin Zhu
Paria Rashidinejad
Jiantao Jiao
OffRL
38
15
0
30 Jan 2023
Constrained Policy Optimization with Explicit Behavior Density for
  Offline Reinforcement Learning
Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning
Jing Zhang
Chi Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
29
7
0
28 Jan 2023
Improving Behavioural Cloning with Positive Unlabeled Learning
Improving Behavioural Cloning with Positive Unlabeled Learning
Qiang-qiang Wang
Robert McCarthy
David Córdova Bulens
Kevin McGuinness
Noel E. O'Connor
Nico Gürtler
Felix Widmaier
Francisco Roldan Sanchez
S. Redmond
OffRL
OnRL
24
8
0
27 Jan 2023
Plan To Predict: Learning an Uncertainty-Foreseeing Model for
  Model-Based Reinforcement Learning
Plan To Predict: Learning an Uncertainty-Foreseeing Model for Model-Based Reinforcement Learning
Zifan Wu
Chao Yu
Cheng Chen
Jianye Hao
H. Zhuo
11
16
0
20 Jan 2023
Extreme Q-Learning: MaxEnt RL without Entropy
Extreme Q-Learning: MaxEnt RL without Entropy
Divyansh Garg
Joey Hejna
M. Geist
Stefano Ermon
OffRL
33
63
0
05 Jan 2023
Imitation Is Not Enough: Robustifying Imitation with Reinforcement
  Learning for Challenging Driving Scenarios
Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios
Yiren Lu
Justin Fu
George Tucker
Xinlei Pan
Eli Bronstein
...
Brandyn White
Aleksandra Faust
Shimon Whiteson
Drago Anguelov
Sergey Levine
OffRL
26
92
0
21 Dec 2022
Bridging the Gap Between Offline and Online Reinforcement Learning
  Evaluation Methodologies
Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies
Shivakanth Sujit
Pedro H. M. Braga
J. Bornschein
Samira Ebrahimi Kahou
OffRL
17
1
0
15 Dec 2022
Confidence-Conditioned Value Functions for Offline Reinforcement
  Learning
Confidence-Conditioned Value Functions for Offline Reinforcement Learning
Joey Hong
Aviral Kumar
Sergey Levine
OffRL
33
20
0
08 Dec 2022
Model-based trajectory stitching for improved behavioural cloning and
  its applications
Model-based trajectory stitching for improved behavioural cloning and its applications
Charles A. Hepburn
Giovanni Montana
OffRL
16
5
0
08 Dec 2022
Accelerating Self-Imitation Learning from Demonstrations via Policy
  Constraints and Q-Ensemble
Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL
29
0
0
07 Dec 2022
PrefRec: Recommender Systems with Human Preferences for Reinforcing
  Long-term User Engagement
PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement
Wanqi Xue
Qingpeng Cai
Zhenghai Xue
Shuo Sun
Shuchang Liu
Dong Zheng
Peng Jiang
Kun Gai
Bo An
OffRL
28
25
0
06 Dec 2022
TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from
  Mixed Datasets
TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets
Yuanying Cai
Chuheng Zhang
Li Zhao
Wei Shen
Xuyun Zhang
Lei Song
Jiang Bian
Tao Qin
Tie-Yan Liu
OffRL
17
3
0
05 Dec 2022
Flow to Control: Offline Reinforcement Learning with Lossless Primitive
  Discovery
Flow to Control: Offline Reinforcement Learning with Lossless Primitive Discovery
Yiqin Yang
Haotian Hu
Wenzhe Li
Siyuan Li
Jun Yang
Qianchuan Zhao
Chongjie Zhang
OffRL
28
9
0
02 Dec 2022
One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based
  Offline Reinforcement Learning
One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning
Marc Rigter
Bruno Lacerda
Nick Hawes
OffRL
16
6
0
30 Nov 2022
Efficient Reinforcement Learning Through Trajectory Generation
Efficient Reinforcement Learning Through Trajectory Generation
Wenqi Cui
Linbin Huang
Weiwei Yang
Baosen Zhang
OffRL
23
0
0
30 Nov 2022
Offline Reinforcement Learning with Closed-Form Policy Improvement
  Operators
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Jiachen Li
Edwin Zhang
Ming Yin
Qinxun Bai
Yu-Xiang Wang
William Yang Wang
OffRL
31
15
0
29 Nov 2022
Learning from Good Trajectories in Offline Multi-Agent Reinforcement
  Learning
Learning from Good Trajectories in Offline Multi-Agent Reinforcement Learning
Qiangxing Tian
Kun Kuang
Furui Liu
Baoxiang Wang
OffRL
29
9
0
28 Nov 2022
Offline Q-Learning on Diverse Multi-Task Data Both Scales And
  Generalizes
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes
Aviral Kumar
Rishabh Agarwal
Xinyang Geng
George Tucker
Sergey Levine
OffRL
41
48
0
28 Nov 2022
State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement
  Learning
State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning
Cheng Chen
Hongyao Tang
Yi Ma
Chao Wang
Qianli Shen
Dong Li
Jianye Hao
OffRL
26
0
0
28 Nov 2022
A System for Morphology-Task Generalization via Unified Representation
  and Behavior Distillation
A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation
Hiroki Furuta
Yusuke Iwasawa
Yutaka Matsuo
S. Gu
20
14
0
25 Nov 2022
Improving TD3-BC: Relaxed Policy Constraint for Offline Learning and
  Stable Online Fine-Tuning
Improving TD3-BC: Relaxed Policy Constraint for Offline Learning and Stable Online Fine-Tuning
Alex Beeson
Giovanni Montana
OffRL
OnRL
18
22
0
21 Nov 2022
Model-based Trajectory Stitching for Improved Offline Reinforcement
  Learning
Model-based Trajectory Stitching for Improved Offline Reinforcement Learning
Charles A. Hepburn
Giovanni Montana
OffRL
29
13
0
21 Nov 2022
Let Offline RL Flow: Training Conservative Agents in the Latent Space of
  Normalizing Flows
Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows
D. Akimov
Vladislav Kurenkov
Alexander Nikulin
Denis Tarasov
Sergey Kolesnikov
OffRL
16
9
0
20 Nov 2022
Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch
  Size
Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size
Alexander Nikulin
Vladislav Kurenkov
Denis Tarasov
Dmitry Akimov
Sergey Kolesnikov
OffRL
31
14
0
20 Nov 2022
Previous
123...1011789
Next