2303.05479
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
9 March 2023
Mitsuhiko Nakamoto
Yuexiang Zhai
Anikait Singh
Max Sobol Mark
Yi Ma
Chelsea Finn
Aviral Kumar
Sergey Levine
OffRL
OnRL
Papers citing
"Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning"
50 / 90 papers shown
Title
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
OffRL
OnRL
33
0
0
16 May 2025
What Matters for Batch Online Reinforcement Learning in Robotics?
Perry Dong
Suvir Mirchandani
Dorsa Sadigh
Chelsea Finn
OffRL
36
0
0
12 May 2025
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
Tianjian Li
Daniel Khashabi
60
0
0
05 May 2025
Fine-Tuning without Performance Degradation
Han Wang
Adam White
Martha White
OnRL
250
0
0
01 May 2025
Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance
Wenjun Cao
52
0
0
26 Apr 2025
Efficient Reinforcement Learning by Guiding Generalist World Models with Non-Curated Data
Yi Zhao
Aidan Scannell
Wenshuai Zhao
Yuxin Hou
Tianyu Cui
Le Chen
Dieter Büchler
Arno Solin
Juho Kannala
Joni Pajarinen
OffRL
OnRL
98
1
0
26 Feb 2025
Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport
Mingyang Sun
Pengxiang Ding
Weinan Zhang
Donglin Wang
OT
88
0
0
24 Feb 2025
ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy
Yuhui Chen
Shuai Tian
Shugao Liu
Yingting Zhou
Haoran Li
Dongbin Zhao
OffRL
106
1
0
08 Feb 2025
Rapidly Adapting Policies to the Real World via Simulation-Guided Fine-Tuning
Patrick Yin
Tyler Westenbroek
Simran Bagaria
Kevin Huang
Ching-An Cheng
Andrey Kolobov
Abhishek Gupta
85
2
0
04 Feb 2025
Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network
Jijia Liu
Feng Gao
Q. Liao
Chao Yu
Yu Wang
OffRL
76
0
0
01 Feb 2025
Coordinating Ride-Pooling with Public Transit using Reward-Guided Conservative Q-Learning: An Offline Training and Online Fine-Tuning Reinforcement Learning Framework
Yulong Hu
Tingting Dong
Sen Li
OffRL
OnRL
67
0
0
24 Jan 2025
Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy
Keru Chen
Honghao Wei
Zhigang Deng
Sen Lin
OffRL
OnRL
96
0
0
31 Dec 2024
Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation
Fei Zhao
Xueliang Zhang
36
0
0
25 Dec 2024
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model
Xiu Yuan
Tongzhou Mu
Stone Tao
Yunhao Fang
Mengke Zhang
H. Su
OffRL
76
3
0
18 Dec 2024
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone
Max Sobol Mark
Tian Gao
Georgia Gabriela Sampaio
Mohan Kumar Srirama
Archit Sharma
Chelsea Finn
Aviral Kumar
OffRL
OnRL
106
4
0
09 Dec 2024
Reinforcement Learning Gradients as Vitamin for Online Finetuning Decision Transformers
Kai Yan
Alex Schwing
Yu-Xiong Wang
OffRL
OnRL
41
0
0
31 Oct 2024
Offline Behavior Distillation
Shiye Lei
Sen Zhang
Dacheng Tao
OffRL
41
0
0
30 Oct 2024
Offline-to-Online Multi-Agent Reinforcement Learning with Offline Value Function Memory and Sequential Exploration
Hai Zhong
Xun Wang
Zhuoran Li
Longbo Huang
OffRL
OnRL
34
0
0
25 Oct 2024
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
Max Wilcoxson
Qiyang Li
Kevin Frans
Sergey Levine
SSL
OffRL
OnRL
61
0
0
23 Oct 2024
Offline-to-online Reinforcement Learning for Image-based Grasping with Scarce Demonstrations
Bryan Chan
Anson Leung
James Bergstra
OffRL
OnRL
67
0
0
19 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
79
14
0
17 Oct 2024
LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models
Hossein Abdi
Mingfei Sun
Andi Zhang
Samuel Kaski
Wei Pan
30
0
0
15 Oct 2024
From Reward Shaping to Q-Shaping: Achieving Unbiased Learning with LLM-Guided Knowledge
Xiefeng Wu
OffRL
34
1
0
02 Oct 2024
Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner
Chenyou Fan
Chenjia Bai
Zhao Shan
Haoran He
Yang Zhang
Zhen Wang
40
3
0
30 Sep 2024
FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning
Jiaheng Hu
Rose Hendrix
Ali Farhadi
Aniruddha Kembhavi
Roberto Martín-Martín
Peter Stone
Kuo-Hao Zeng
Kiana Ehsani
53
7
0
25 Sep 2024
Goal-Reaching Policy Learning from Non-Expert Observations via Effective Subgoal Guidance
Renming Huang
Shaochong Liu
Yunqiang Pei
Peng Wang
Guoqing Wang
Yang Yang
Hengtao Shen
OffRL
42
0
0
06 Sep 2024
Diffusion Policy Policy Optimization
Allen Z. Ren
Justin Lidard
Lars L. Ankile
Anthony Simeonov
Pulkit Agrawal
Anirudha Majumdar
Benjamin Burchfiel
Hongkai Dai
Max Simchowitz
59
38
0
01 Sep 2024
Unsupervised-to-Online Reinforcement Learning
Junsu Kim
Seohong Park
Sergey Levine
OnRL
65
3
0
27 Aug 2024
D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning
Rafael Rafailov
Kyle Hatch
Anikait Singh
Laura Smith
Aviral Kumar
...
Victor Kolev
Philip J. Ball
Jiajun Wu
Chelsea Finn
Sergey Levine
OffRL
34
3
0
15 Aug 2024
Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs
Kevin Tan
Wei Fan
Yuting Wei
OffRL
77
3
0
08 Aug 2024
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
Xu-Hui Liu
Tian-Shuo Liu
Shengyi Jiang
Ruifeng Chen
Zhilong Zhang
Xinwei Chen
Yang Yu
OffRL
OnRL
38
2
0
17 Jul 2024
Affordance-Guided Reinforcement Learning via Visual Prompting
Olivia Y. Lee
Annie Xie
Kuan Fang
Karl Pertsch
Chelsea Finn
OffRL
LM&Ro
76
9
0
14 Jul 2024
FOSP: Fine-tuning Offline Safe Policy through World Models
Chenyang Cao
Yucheng Xin
Silang Wu
Longxiang He
Zichen Yan
Junbo Tan
Xueqian Wang
OffRL
69
0
0
06 Jul 2024
Hybrid Reinforcement Learning from Offline Observation Alone
Yuda Song
J. Andrew Bagnell
Aarti Singh
OffRL
86
2
0
11 Jun 2024
Investigating Pre-Training Objectives for Generalization in Vision-Based Reinforcement Learning
Donghu Kim
Hojoon Lee
Kyungmin Lee
Dongyoon Hwang
Jaegul Choo
OffRL
46
1
0
10 Jun 2024
Strategically Conservative Q-Learning
Yutaka Shimizu
Joey Hong
Sergey Levine
Masayoshi Tomizuka
OffRL
OnRL
50
0
0
06 Jun 2024
Transductive Off-policy Proximal Policy Optimization
Yaozhong Gan
Renye Yan
Xiaoyang Tan
Zhe Wu
Junliang Xing
OffRL
37
2
0
06 Jun 2024
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Haotian Hu
Yiqin Yang
Jianing Ye
Chengjie Wu
Ziqing Mai
Yujing Hu
Tangjie Lv
Changjie Fan
Qianchuan Zhao
Chongjie Zhang
OffRL
OnRL
45
3
0
31 May 2024
Leveraging Offline Data in Linear Latent Bandits
Chinmaya Kausik
Kevin Tan
Ambuj Tewari
OffRL
51
2
0
27 May 2024
How to Leverage Diverse Demonstrations in Offline Imitation Learning
Sheng Yue
Jiani Liu
Xingyuan Hua
Ju Ren
Sen Lin
Junshan Zhang
Yaoxue Zhang
OffRL
34
3
0
24 May 2024
Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning
Changhong Wang
Xudong Yu
Chenjia Bai
Qiaosheng Zhang
Zhen Wang
40
1
0
12 May 2024
Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning
Stone Tao
Arth Shukla
Tse-Kai Chan
Hao Su
OffRL
41
4
0
06 May 2024
DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets
Xiaoyu Huang
Yufeng Chi
Ruofeng Wang
Zhongyu Li
Xue Bin Peng
Sophia Shao
Borivoje Nikolic
Koushil Sreenath
OffRL
83
27
0
30 Apr 2024
Overcoming Knowledge Barriers: Online Imitation Learning from Visual Observation with Pretrained World Models
Xingyuan Zhang
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
OffRL
50
0
0
29 Apr 2024
ASID: Active Exploration for System Identification in Robotic Manipulation
Marius Memmel
Andrew Wagenmaker
Chuning Zhu
Patrick Yin
Dieter Fox
Abhishek Gupta
42
13
0
18 Apr 2024
A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage
Kevin Tan
Ziping Xu
OffRL
OnRL
42
5
0
07 Mar 2024
SELFI: Autonomous Self-Improvement with Reinforcement Learning for Social Navigation
Noriaki Hirose
Dhruv Shah
Kyle Stachowicz
A. Sridhar
Sergey Levine
71
5
0
01 Mar 2024
Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency
Yanxiao Zhao
Yangge Qian
Tianyi Wang
Jingyang Shan
Xiaolin Qin
29
0
0
01 Mar 2024
Foundation Policies with Hilbert Representations
Seohong Park
Tobias Kreiman
Sergey Levine
SSL
OffRL
55
21
0
23 Feb 2024
THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation
Wilbert Pumacay
Ishika Singh
Jiafei Duan
Ranjay Krishna
Jesse Thomason
Dieter Fox
29
40
0
13 Feb 2024