2107.00591
Cited By
v1
v2 (latest)
Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble
1 July 2021
Seunghyun Lee
Younggyo Seo
Kimin Lee
Pieter Abbeel
Jinwoo Shin
OffRL
OnRL
ArXiv (abs)
PDF
HTML
Github (56★)
Papers citing
"Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble"
50 / 129 papers shown
Title
Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization
Talha Bozkus
Urbashi Mitra
OffRL
82
5
0
08 Feb 2024
Learning Uncertainty-Aware Temporally-Extended Actions
Joongkyu Lee
Seung Joon Park
Yunhao Tang
Min-hwan Oh
64
2
0
08 Feb 2024
The Essential Role of Causality in Foundation World Models for Embodied AI
Tarun Gupta
Wenbo Gong
Chao Ma
Nick Pawlowski
Agrin Hilmkil
...
Jianfeng Gao
Stefan Bauer
Danica Kragic
Bernhard Schölkopf
Cheng Zhang
92
17
0
06 Feb 2024
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Maciej Wolczyk
Bartłomiej Cupiał
M. Ostaszewski
Michal Bortkiewicz
Michał Zając
Razvan Pascanu
Łukasz Kuciński
Piotr Miłoś
CLL
153
18
0
05 Feb 2024
Optimistic Model Rollouts for Pessimistic Offline Policy Optimization
Yuanzhao Zhai
Yiying Li
Zijian Gao
Xudong Gong
Kele Xu
Dawei Feng
Bo Ding
Huaimin Wang
OffRL
76
2
0
11 Jan 2024
A unified uncertainty-aware exploration: Combining epistemic and aleatory uncertainty
Parvin Malekzadeh
Ming Hou
Konstantinos N. Plataniotis
UD
96
3
0
05 Jan 2024
Diffusion Reward: Learning Rewards via Conditional Video Diffusion
Tao Huang
Guangqi Jiang
Yanjie Ze
Huazhe Xu
VGen
115
26
0
21 Dec 2023
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
Yinmin Zhang
Jie Liu
Chuming Li
Yazhe Niu
Yaodong Yang
Yu Liu
Wanli Ouyang
OffRL
OnRL
133
12
0
12 Dec 2023
Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization
Carlos E. Luis
A. Bottero
Julia Vinogradska
Felix Berkenkamp
Jan Peters
OffRL
107
3
0
07 Dec 2023
Pearl: A Production-ready Reinforcement Learning Agent
Zheqing Zhu
Rodrigo de Salvo Braz
Jalaj Bhandari
Daniel Jiang
Yi Wan
...
D. Korenkevych
Ürün Dogan
Frank Cheng
Zheng Wu
Wanqiao Xu
VLM
OffRL
OnRL
129
7
0
06 Dec 2023
Lights out: training RL agents robust to temporary blindness
Nathan Ordonez
Marije Tromp
Pau Marquez Julbe
Wendelin Böhmer
58
0
0
05 Dec 2023
Replay across Experiments: A Natural Extension of Off-Policy RL
Dhruva Tirumala
Thomas Lampe
José Enrique Chen
Tuomas Haarnoja
Sandy Huang
...
Tim Hertweck
Leonard Hasenclever
Martin Riedmiller
N. Heess
Markus Wulfmeier
OffRL
105
8
0
27 Nov 2023
RLIF: Interactive Imitation Learning as Reinforcement Learning
Jianlan Luo
Perry Dong
Yuexiang Zhai
Yi-An Ma
Sergey Levine
OffRL
123
18
0
21 Nov 2023
Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees
Yifei Zhou
Ayush Sekhari
Yuda Song
Wen Sun
OffRL
OnRL
65
8
0
14 Nov 2023
Accelerating Exploration with Unlabeled Prior Data
Qiyang Li
Jason Zhang
Dibya Ghosh
Amy Zhang
Sergey Levine
OffRL
OnRL
104
9
0
09 Nov 2023
Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization
Kun Lei
Zhengmao He
Chenhao Lu
Kaizhe Hu
Yang Gao
Huazhe Xu
OffRL
OnRL
132
13
0
06 Nov 2023
Unsupervised Behavior Extraction via Random Intent Priors
Haotian Hu
Yiqin Yang
Jianing Ye
Ziqing Mai
Chongjie Zhang
OffRL
81
9
0
28 Oct 2023
Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning
Shenzhi Wang
Qisen Yang
Jiawei Gao
Matthieu Lin
Hao Chen
Liwei Wu
Ning Jia
Shiji Song
Gao Huang
OffRL
103
15
0
27 Oct 2023
Finetuning Offline World Models in the Real World
Yunhai Feng
Nicklas Hansen
Ziyan Xiong
Chandramouli Rajagopalan
Xiaolong Wang
OffRL
OnRL
77
22
0
24 Oct 2023
Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning
Jingyun Yang
Max Sobol Mark
Brandon Vu
Archit Sharma
Jeannette Bohg
Chelsea Finn
OffRL
OnRL
95
26
0
23 Oct 2023
Robot Skill Generalization via Keypoint Integrated Soft Actor-Critic Gaussian Mixture Models
Iman Nematollahi
Kirill Yankov
Wolfram Burgard
Tim Welschehold
78
0
0
23 Oct 2023
Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias
Max Sobol Mark
Archit Sharma
Fahim Tajwar
Rafael Rafailov
Sergey Levine
Chelsea Finn
OffRL
OnRL
111
2
0
12 Oct 2023
Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning
Trevor A. McInroe
Adam Jelley
Stefano V. Albrecht
Amos Storkey
OffRL
OnRL
76
6
0
09 Oct 2023
Improving Offline-to-Online Reinforcement Learning with Q Conditioned State Entropy Exploration
Ziqi Zhang
Xiao Xiong
Zifeng Zhuang
Jinxin Liu
Donglin Wang
OffRL
OnRL
115
0
0
07 Oct 2023
PCGPT: Procedural Content Generation via Transformers
Sajad Mohaghegh
Mohammad Amin Ramezan Dehnavi
Golnoosh Abdollahinejad
Matin Hashemi
ViT
71
2
0
03 Oct 2023
Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning
Daoce Wang
Chi Jin
OffRL
DiffM
98
35
0
29 Sep 2023
Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness
Xiaoyu Wen
Xudong Yu
Rui Yang
Chenjia Bai
Zhen Wang
OffRL
OnRL
81
10
0
29 Sep 2023
Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning
Jianzhun Shao
Yun Qu
Chen Chen
Hongchang Zhang
Xiangyang Ji
OffRL
87
22
0
22 Sep 2023
H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps
Haoyi Niu
Tianying Ji
Bingqi Liu
Haocheng Zhao
Xiangyu Zhu
Jianying Zheng
Pengfei Huang
Guyue Zhou
Jianming Hu
Xianyuan Zhan
OffRL
OnRL
AI4CE
120
9
0
22 Sep 2023
Uncertainty-driven Exploration Strategies for Online Grasp Learning
Yitian Shi
Philipp Schillinger
Miroslav Gabriel
Alexander Kuss
Zohar Feldman
Hanna Ziesche
Ngo Anh Vien
OffRL
OnRL
61
4
0
21 Sep 2023
Mitigating the Alignment Tax of RLHF
Yong Lin
Hangyu Lin
Wei Xiong
Shizhe Diao
Zeming Zheng
...
Han Zhao
Nan Jiang
Heng Ji
Yuan Yao
Tong Zhang
MoMe
CLL
112
81
0
12 Sep 2023
Beyond Conservatism: Diffusion Policies in Offline Multi-agent Reinforcement Learning
Zhuoran Li
Ling Pan
Longbo Huang
DiffM
OffRL
66
8
0
04 Jul 2023
ChiPFormer: Transferable Chip Placement via Offline Decision Transformer
Yao Lai
Jinxin Liu
Zhentao Tang
Bin Wang
Jianye Hao
Ping Luo
OffRL
98
43
0
26 Jun 2023
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
Hang Wang
Sen Lin
Junshan Zhang
OffRL
OnRL
81
3
0
20 Jun 2023
A Simple Unified Uncertainty-Guided Framework for Offline-to-Online Reinforcement Learning
Siyuan Guo
Yanchao Sun
Jifeng Hu
Sili Huang
Hechang Chen
Haiyin Piao
Lichao Sun
Yi-Ju Chang
OffRL
OnRL
85
7
0
13 Jun 2023
Improving Offline-to-Online Reinforcement Learning with Q-Ensembles
Kai-Wen Zhao
Yi-An Ma
Jianye Hao
Jinyi Liu
Yan Zheng
Zhaopeng Meng
OffRL
OnRL
111
12
0
12 Jun 2023
Decoupled Prioritized Resampling for Offline RL
Yang Yue
Bingyi Kang
Xiao Ma
Qisen Yang
Gao Huang
S. Song
Shuicheng Yan
OffRL
88
8
0
08 Jun 2023
Survival Instinct in Offline Reinforcement Learning
Anqi Li
Dipendra Kumar Misra
Andrey Kolobov
Ching-An Cheng
OffRL
93
18
0
05 Jun 2023
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Tianying Ji
Yuping Luo
Gang Hua
Xianyuan Zhan
Jianwei Zhang
Huazhe Xu
OffRL
OnRL
116
17
0
05 Jun 2023
PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Ya Zhang
OffRL
OnRL
97
19
0
25 May 2023
Revisiting the Minimalist Approach to Offline Reinforcement Learning
Denis Tarasov
Vladislav Kurenkov
Alexander Nikulin
Sergey Kolesnikov
OffRL
100
51
0
16 May 2023
FastRLAP: A System for Learning High-Speed Driving via Deep RL and Autonomous Practicing
Kyle Stachowicz
Dhruv Shah
Arjun Bhorkar
Ilya Kostrikov
Sergey Levine
OffRL
75
28
0
19 Apr 2023
Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions
Yicheng Luo
Jackie Kay
Edward Grefenstette
M. Deisenroth
OffRL
OnRL
69
16
0
30 Mar 2023
Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale
Botao Hao
Rahul Jain
Dengwang Tang
Zheng Wen
OffRL
57
3
0
20 Mar 2023
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Han Zheng
Xufang Luo
Pengfei Wei
Xuan Song
Dongsheng Li
Jing Jiang
OffRL
OnRL
69
24
0
14 Mar 2023
Deploying Offline Reinforcement Learning with Human Feedback
Ziniu Li
Kelvin Xu
Liu Liu
Lanqing Li
Deheng Ye
P. Zhao
OffRL
93
2
0
13 Mar 2023
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
Mitsuhiko Nakamoto
Yuexiang Zhai
Anika Singh
Max Sobol Mark
Yi-An Ma
Chelsea Finn
Aviral Kumar
Sergey Levine
OffRL
OnRL
190
125
0
09 Mar 2023
Ensemble Reinforcement Learning: A Survey
Yanjie Song
Ponnuthurai Nagaratnam Suganthan
Witold Pedrycz
Junwei Ou
Yongming He
Y. Chen
Yutong Wu
OffRL
91
41
0
05 Mar 2023
Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning
Carolin Schmidt
Daniele Gammelli
Francisco Câmara Pereira
Filipe Rodrigues
OffRL
77
5
0
28 Feb 2023
Efficient Online Reinforcement Learning with Offline Data
Philip J. Ball
Laura M. Smith
Ilya Kostrikov
Sergey Levine
OffRL
OnRL
147
184
0
06 Feb 2023