Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble
arXiv:2107.00591, v2 (latest)
1 July 2021
Seunghyun Lee, Younggyo Seo, Kimin Lee, Pieter Abbeel, Jinwoo Shin
OffRL, OnRL
Links: arXiv abstract · PDF · HTML · GitHub (56★)
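For orientation, here is a minimal, hypothetical sketch of one ingredient named in the title: a pessimistic value estimate taken from an ensemble of Q-functions. The mean-minus-standard-deviation rule, the `lam` weight, and the stand-in Q-functions below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def pessimistic_q(q_ensemble, state, action, lam=1.0):
    """Conservative value estimate from a Q-ensemble: mean minus
    lam * std over the ensemble's predictions. (Illustrative
    heuristic; lam is an assumed hyperparameter.)"""
    values = np.array([q(state, action) for q in q_ensemble])
    return values.mean() - lam * values.std()

# Toy usage: random linear Q-functions stand in for offline-trained networks.
rng = np.random.default_rng(0)
q_ensemble = [
    (lambda s, a, w=rng.normal(size=4): float(w @ np.concatenate([s, a])))
    for _ in range(5)
]
state, action = np.ones(2), np.zeros(2)
print(pessimistic_q(q_ensemble, state, action))
```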

Papers citing "Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble"

Showing 29 of 129 citing papers:

Accelerating Policy Gradient by Estimating Value Function from Prior Computation in Deep Reinforcement Learning
Hassam Sheikh, Mariano Phielipp
OffRL · 02 Feb 2023

Policy Expansion for Bridging Offline-to-Online Reinforcement Learning
Haichao Zhang, Weiwen Xu, Haonan Yu
CLL, OffRL, OnRL · 02 Feb 2023

SMART: Self-supervised Multi-task pretrAining with contRol Transformers
Yanchao Sun, Shuang Ma, Ratnesh Madaan, Rogerio Bonatti, Furong Huang, Ashish Kapoor
24 Jan 2023

Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL · 07 Dec 2022

Offline Supervised Learning V.S. Online Direct Policy Optimization: A Comparative Study and A Unified Training Paradigm for Neural Network-Based Optimal Feedback Control
Yue Zhao, Jiequn Han
OffRL · 29 Nov 2022

Hypernetworks for Zero-shot Transfer in Reinforcement Learning
S. Rezaei-Shoshtari, Charlotte Morissette, F. Hogan, Gregory Dudek, David Meger
OffRL · 28 Nov 2022

Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size
Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Dmitry Akimov, Sergey Kolesnikov
OffRL · 20 Nov 2022

Pretraining in Deep Reinforcement Learning: A Survey
Zhihui Xie, Zichuan Lin, Junyou Li, Shuai Li, Deheng Ye
OffRL, OnRL, AI4CE · 08 Nov 2022

Learning on the Job: Self-Rewarding Offline-to-Online Finetuning for Industrial Insertion of Novel Connectors from Vision
Ashvin Nair, Brian Zhu, Gokul Narayanan, Eugen Solowjow, Sergey Levine
OffRL, OnRL · 27 Oct 2022

Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning
Yi Zhao, Rinu Boney, Alexander Ilin, Arno Solin, Joni Pajarinen
OffRL, OnRL · 25 Oct 2022

Sustainable Online Reinforcement Learning for Auto-bidding
Zhiyu Mou, Yusen Huo, Rongquan Bai, Mingzhou Xie, Chuan Yu, Jian Xu, Bo Zheng
OffRL, OnRL · 13 Oct 2022

Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
Yuda Song, Yi Zhou, Ayush Sekhari, J. Andrew Bagnell, A. Krishnamurthy, Wen Sun
OffRL, OnRL · 13 Oct 2022

Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks
Kuan Fang, Patrick Yin, Ashvin Nair, Homer Walke, Ge Yan, Sergey Levine
OffRL · 12 Oct 2022

Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials
Aviral Kumar, Anika Singh, F. Ebert, Mitsuhiko Nakamoto, Yanlai Yang, Chelsea Finn, Sergey Levine
OffRL, OnRL · 11 Oct 2022

C^2: Co-design of Robots via Concurrent Networks Coupling Online and Offline Reinforcement Learning
Ci Chen, Pingyu Xiang, Haojian Lu, Yue Wang, R. Xiong
OffRL · 14 Sep 2022

A Review of Uncertainty for Deep Reinforcement Learning
Owen Lockwood, Mei Si
18 Aug 2022

Mildly Conservative Q-Learning for Offline Reinforcement Learning
Jiafei Lyu, Xiaoteng Ma, Xiu Li, Zongqing Lu
OffRL · 09 Jun 2022

Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Rameswar Panda, Marc G. Bellemare
OffRL, OnRL · 03 Jun 2022

Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters
Seyed Kamyar Seyed Ghasemipour, S. Gu, Ofir Nachum
OffRL · 27 May 2022

No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL
Han Wang, Archit Sakhadeo, Adam White, James Bell, Vincent Liu, Xutong Zhao, Puer Liu, Tadashi Kozuno, Alona Fyshe, Martha White
OffRL, OnRL · 18 May 2022

Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space
Kuan Fang, Patrick Yin, Ashvin Nair, Sergey Levine
OffRL · 17 May 2022

How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation
Alex X. Lee, Coline Devin, Jost Tobias Springenberg, Yuxiang Zhou, Thomas Lampe, A. Abdolmaleki, Konstantinos Bousmalis
OffRL, OnRL · 06 May 2022

A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems
Rafael Figueiredo Prudencio, Marcos R. O. A. Máximo, Esther Luna Colombini
OffRL · 02 Mar 2022

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning
Chenjia Bai, Lingxiao Wang, Zhuoran Yang, Zhihong Deng, Animesh Garg, Peng Liu, Zhaoran Wang
OffRL · 23 Feb 2022

Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL · 23 Feb 2022

VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning
Che Wang, Xufang Luo, George Andriopoulos, Dongsheng Li
OffRL · 17 Feb 2022

Online Decision Transformer
Qinqing Zheng, Amy Zhang, Aditya Grover
OffRL · 11 Feb 2022

Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification
L. Pan, Longbo Huang, Tengyu Ma, Huazhe Xu
OffRL, OnRL · 22 Nov 2021

Value Penalized Q-Learning for Recommender Systems
Chengqian Gao, Ke Xu, Kuangqi Zhou, Lanqing Li, Xueqian Wang, Bo Yuan, P. Zhao
OffRL · 15 Oct 2021