2107.00591
Cited By
v1
v2 (latest)
Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble
1 July 2021
Seunghyun Lee
Younggyo Seo
Kimin Lee
Pieter Abbeel
Jinwoo Shin
OffRL
OnRL
Github (56★)
Papers citing
"Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble"
50 / 129 papers shown
Title
Augmenting Online RL with Offline Data is All You Need: A Unified Hybrid RL Algorithm Design and Analysis
Ruiquan Huang
Donghao Li
Chengshuai Shi
Cong Shen
Jing Yang
OffRL
110
0
0
01 Jul 2025
MOORL: A Framework for Integrating Offline-Online Reinforcement Learning
Gaurav Chaudhary
Wassim Uddin Mondal
Laxmidhar Behera
OffRL
108
0
0
11 Jun 2025
Reinforcement Learning via Implicit Imitation Guidance
Perry Dong
Alec M. Lessing
Annie S. Chen
Chelsea Finn
OffRL
29
0
0
09 Jun 2025
Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data
Lingkai Kong
Haichuan Wang
Tonghan Wang
Guojun Xiong
Milind Tambe
OffRL
56
0
0
29 May 2025
Universal Value-Function Uncertainties
Moritz A. Zanger
Max Weltevrede
Yaniv Oren
Pascal R. van der Vaart
Caroline Horsch
Wendelin Bohmer
M. Spaan
OffRL
76
0
0
27 May 2025
Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL
Qin-Wen Luo
Ming-Kun Xie
Ye-Wen Wang
Sheng-Jun Huang
OffRL
44
0
0
26 May 2025
MA-ROESL: Motion-aware Rapid Reward Optimization for Efficient Robot Skill Learning from Single Videos
Xinyu Wang
Xinming Zhang
Yanjun Chen
Xiaoyu Shen
Wei Zhang
63
0
0
13 May 2025
What Matters for Batch Online Reinforcement Learning in Robotics?
Perry Dong
Suvir Mirchandani
Dorsa Sadigh
Chelsea Finn
OffRL
60
0
0
12 May 2025
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
Tianjian Li
Daniel Khashabi
141
0
0
05 May 2025
Fine-Tuning without Performance Degradation
Han Wang
Adam White
Martha White
OnRL
438
0
0
01 May 2025
Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance
Wenjun Cao
88
0
0
26 Apr 2025
Evaluation-Time Policy Switching for Offline Reinforcement Learning
Natinael Solomon Neggatu
Jeremie Houssineau
Giovanni Montana
OffRL
OnRL
112
0
0
15 Mar 2025
Yes, Q-learning Helps Offline In-Context RL
Denis Tarasov
Alexander Nikulin
Ilya Zisman
Albina Klepach
Andrei Polubarov
Nikita Lyubaykin
Alexander Derevyagin
Igor Kiselev
Vladislav Kurenkov
OffRL
OnRL
494
3
0
24 Feb 2025
SAMG: Offline-to-Online Reinforcement Learning via State-Action-Conditional Offline Model Guidance
Liyu Zhang
Haochi Wu
Xu Wan
Quan Kong
Ruilong Deng
Mingyang Sun
OffRL
OnRL
68
0
0
24 Feb 2025
Skill Expansion and Composition in Parameter Space
Tenglong Liu
Junjie Li
Yinan Zheng
Haoyi Niu
Yixing Lan
Xin Xu
Xianyuan Zhan
129
4
0
09 Feb 2025
ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy
Yuhui Chen
Shuai Tian
Shugao Liu
Yingting Zhou
Haoran Li
Dongbin Zhao
OffRL
225
13
0
08 Feb 2025
Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network
Jijia Liu
Feng Gao
Q. Liao
Chao Yu
Yu Wang
OffRL
174
0
0
01 Feb 2025
Coordinating Ride-Pooling with Public Transit using Reward-Guided Conservative Q-Learning: An Offline Training and Online Fine-Tuning Reinforcement Learning Framework
Yulong Hu
Tingting Dong
Sen Li
OffRL
OnRL
121
1
0
24 Jan 2025
Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy
Keru Chen
Honghao Wei
Zhigang Deng
Sen Lin
OffRL
OnRL
168
0
0
31 Dec 2024
Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation
Fei Zhao
Xueliang Zhang
92
0
0
25 Dec 2024
Efficient Language-instructed Skill Acquisition via Reward-Policy Co-Evolution
Changxin Huang
Yanbin Chang
Junfan Lin
Junyang Liang
Runhao Zeng
Jianqiang Li
114
0
0
18 Dec 2024
Adapting to Non-Stationary Environments: Multi-Armed Bandit Enhanced Retrieval-Augmented Generation on Knowledge Graphs
Xiaqiang Tang
Jian Li
Nan Du
Sihong Xie
145
3
0
10 Dec 2024
A Non-Monolithic Policy Approach of Offline-to-Online Reinforcement Learning
JaeYoon Kim
Junyu Xuan
Christy Jie Liang
F. Hussain
OffRL
OnRL
62
0
0
31 Oct 2024
Offline-to-Online Multi-Agent Reinforcement Learning with Offline Value Function Memory and Sequential Exploration
Hai Zhong
Xun Wang
Zhuoran Li
Longbo Huang
OffRL
OnRL
72
1
0
25 Oct 2024
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
Max Wilcoxson
Qiyang Li
Kevin Frans
Sergey Levine
SSL
OffRL
OnRL
189
0
0
23 Oct 2024
Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces
Jifeng Hu
Sili Huang
Li Shen
Zhejian Yang
Shengchao Hu
Shisong Tang
Hechang Chen
Yi Chang
Dacheng Tao
Lichao Sun
OffRL
89
0
0
21 Oct 2024
Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration
Yun Qu
Boyuan Wang
Yuhang Jiang
Jianzhun Shao
Yixiu Mao
Cheems Wang
Chang Liu
Xiangyang Ji
135
5
0
03 Oct 2024
FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning
Jiaheng Hu
Rose Hendrix
Ali Farhadi
Aniruddha Kembhavi
Roberto Martín-Martín
Peter Stone
Kuo-Hao Zeng
Kiana Ehsani
123
15
0
25 Sep 2024
Goal-Reaching Policy Learning from Non-Expert Observations via Effective Subgoal Guidance
Renming Huang
Shaochong Liu
Yunqiang Pei
Peng Wang
Guoqing Wang
Yang Yang
Hengtao Shen
OffRL
86
0
0
06 Sep 2024
Diffusion Policy Policy Optimization
Allen Z. Ren
Justin Lidard
Lars L. Ankile
Anthony Simeonov
Pulkit Agrawal
Anirudha Majumdar
Benjamin Burchfiel
Hongkai Dai
Max Simchowitz
165
57
0
01 Sep 2024
Unsupervised-to-Online Reinforcement Learning
Junsu Kim
Seohong Park
Sergey Levine
OnRL
102
5
0
27 Aug 2024
Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks
Yun Qu
Boyuan Wang
Jianzhun Shao
Yuhang Jiang
Chen Chen
...
Qiang Fu
Wei Yang
Guang Yang
Lanxiao Huang
Xiangyang Ji
OffRL
108
10
0
20 Aug 2024
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
Xu-Hui Liu
Tian-Shuo Liu
Shengyi Jiang
Ruifeng Chen
Zhilong Zhang
Xinwei Chen
Yang Yu
OffRL
OnRL
83
3
0
17 Jul 2024
A Benchmark Environment for Offline Reinforcement Learning in Racing Games
Girolamo Macaluso
Alessandro Sestini
Andrew D. Bagdanov
OffRL
71
1
0
12 Jul 2024
FOSP: Fine-tuning Offline Safe Policy through World Models
Chenyang Cao
Yucheng Xin
Silang Wu
Longxiang He
Zichen Yan
Junbo Tan
Xueqian Wang
OffRL
144
1
0
06 Jul 2024
Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning
Mohammadreza Nakhaei
Aidan Scannell
Joni Pajarinen
OffRL
103
1
0
12 Jun 2024
Hybrid Reinforcement Learning from Offline Observation Alone
Yuda Song
J. Andrew Bagnell
Aarti Singh
OffRL
125
2
0
11 Jun 2024
Strategically Conservative Q-Learning
Yutaka Shimizu
Joey Hong
Sergey Levine
Masayoshi Tomizuka
OffRL
OnRL
86
0
0
06 Jun 2024
DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays
Bo Xia
Yilun Kong
Yongzhe Chang
Bo Yuan
Zhiheng Li
Xueqian Wang
Bin Liang
OffRL
106
3
0
05 Jun 2024
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Haotian Hu
Yiqin Yang
Jianing Ye
Chengjie Wu
Ziqing Mai
Yujing Hu
Tangjie Lv
Changjie Fan
Qianchuan Zhao
Chongjie Zhang
OffRL
OnRL
78
3
0
31 May 2024
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Linjiajie Fang
Ruoxue Liu
Jing Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
179
7
0
31 May 2024
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Yu-Juan Luo
Tianying Ji
Gang Hua
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
OffRL
OnRL
110
3
0
28 May 2024
Exclusively Penalized Q-learning for Offline Reinforcement Learning
Junghyuk Yeom
Yonghyeon Jo
Jungmo Kim
Sanghyeon Lee
Seungyul Han
OffRL
111
3
0
23 May 2024
vMFER: Von Mises-Fisher Experience Resampling Based on Uncertainty of Gradient Directions for Policy Improvement
Yiwen Zhu
Jinyi Liu
Wenya Wei
Qianyi Fu
Yujing Hu
Zhou Fang
Bo An
Jianye Hao
Tangjie Lv
Changjie Fan
92
4
0
14 May 2024
Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning
Changhong Wang
Xudong Yu
Chenjia Bai
Qiaosheng Zhang
Zhen Wang
84
1
0
12 May 2024
Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning
Dhruva Tirumala
Markus Wulfmeier
Ben Moran
Sandy Huang
Jan Humplik
...
Kushal Patel
Marlon Gwira
Francesco Nori
Martin Riedmiller
N. Heess
78
14
0
03 May 2024
Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning
Chenjia Bai
Lingxiao Wang
Jianye Hao
Zhuoran Yang
Bin Zhao
Zhen Wang
Xuelong Li
OffRL
84
9
0
30 Apr 2024
Overcoming Knowledge Barriers: Online Imitation Learning from Visual Observation with Pretrained World Models
Xingyuan Zhang
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
OffRL
109
0
0
29 Apr 2024
Enhancing Reinforcement Learning Agents with Local Guides
Paul Daoudi
Bogdan Robu
Christophe Prieur
Ludovic Dos Santos
M. Barlier
OnRL
88
3
0
21 Feb 2024
Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss
Ruijie Zheng
Yongyuan Liang
Xiyao Wang
Shuang Ma
Hal Daumé
Huazhe Xu
John Langford
Praveen Palanisamy
Kalyan Shankar Basu
Furong Huang
107
8
0
09 Feb 2024