Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning

9 March 2023
Mitsuhiko Nakamoto
Yuexiang Zhai
Anikait Singh
Max Sobol Mark
Yi Ma
Chelsea Finn
Aviral Kumar
Sergey Levine
    OffRL
    OnRL

Papers citing "Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning"

50 / 90 papers shown
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
OffRL
OnRL
33
0
0
16 May 2025
What Matters for Batch Online Reinforcement Learning in Robotics?
Perry Dong
Suvir Mirchandani
Dorsa Sadigh
Chelsea Finn
OffRL
36
0
0
12 May 2025
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
Tianjian Li
Daniel Khashabi
60
0
0
05 May 2025
Fine-Tuning without Performance Degradation
Han Wang
Adam White
Martha White
OnRL
250
0
0
01 May 2025
Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance
Wenjun Cao
52
0
0
26 Apr 2025
Efficient Reinforcement Learning by Guiding Generalist World Models with Non-Curated Data
Yi Zhao
Aidan Scannell
Wenshuai Zhao
Yuxin Hou
Tianyu Cui
Le Chen
Dieter Büchler
Arno Solin
Juho Kannala
Joni Pajarinen
OffRL
OnRL
98
1
0
26 Feb 2025
Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport
Mingyang Sun
Pengxiang Ding
Weinan Zhang
Donglin Wang
OT
88
0
0
24 Feb 2025
ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy
Yuhui Chen
Shuai Tian
Shugao Liu
Yingting Zhou
Haoran Li
Dongbin Zhao
OffRL
106
1
0
08 Feb 2025
Rapidly Adapting Policies to the Real World via Simulation-Guided Fine-Tuning
Patrick Yin
Tyler Westenbroek
Simran Bagaria
Kevin Huang
Ching-an Cheng
Andrey Kolobov
Abhishek Gupta
85
2
0
04 Feb 2025
Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network
Jijia Liu
Feng Gao
Q. Liao
Chao Yu
Yu Wang
OffRL
76
0
0
01 Feb 2025
Coordinating Ride-Pooling with Public Transit using Reward-Guided Conservative Q-Learning: An Offline Training and Online Fine-Tuning Reinforcement Learning Framework
Yulong Hu
Tingting Dong
Sen Li
OffRL
OnRL
67
0
0
24 Jan 2025
Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy
Keru Chen
Honghao Wei
Zhigang Deng
Sen Lin
OffRL
OnRL
96
0
0
31 Dec 2024
Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation
Fei Zhao
Xueliang Zhang
36
0
0
25 Dec 2024
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model
Xiu Yuan
Tongzhou Mu
Stone Tao
Yunhao Fang
Mengke Zhang
H. Su
OffRL
76
3
0
18 Dec 2024
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone
Max Sobol Mark
Tian Gao
Georgia Gabriela Sampaio
Mohan Kumar Srirama
Archit Sharma
Chelsea Finn
Aviral Kumar
OffRL
OnRL
106
4
0
09 Dec 2024
Reinforcement Learning Gradients as Vitamin for Online Finetuning Decision Transformers
Kai Yan
Alex Schwing
Yu-xiong Wang
OffRL
OnRL
41
0
0
31 Oct 2024
Offline Behavior Distillation
Shiye Lei
Sen Zhang
Dacheng Tao
OffRL
41
0
0
30 Oct 2024
Offline-to-Online Multi-Agent Reinforcement Learning with Offline Value Function Memory and Sequential Exploration
Hai Zhong
Xun Wang
Zhuoran Li
Longbo Huang
OffRL
OnRL
34
0
0
25 Oct 2024
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
Max Wilcoxson
Qiyang Li
Kevin Frans
Sergey Levine
SSL
OffRL
OnRL
61
0
0
23 Oct 2024
Offline-to-online Reinforcement Learning for Image-based Grasping with Scarce Demonstrations
Bryan Chan
Anson Leung
James Bergstra
OffRL
OnRL
67
0
0
19 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
79
14
0
17 Oct 2024
LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models
Hossein Abdi
Mingfei Sun
Andi Zhang
Samuel Kaski
Wei Pan
30
0
0
15 Oct 2024
From Reward Shaping to Q-Shaping: Achieving Unbiased Learning with LLM-Guided Knowledge
Xiefeng Wu
OffRL
34
1
0
02 Oct 2024
Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner
Chenyou Fan
Chenjia Bai
Zhao Shan
Haoran He
Yang Zhang
Zhen Wang
40
3
0
30 Sep 2024
FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning
Jiaheng Hu
Rose Hendrix
Ali Farhadi
Aniruddha Kembhavi
Roberto Martín-Martín
Peter Stone
Kuo-Hao Zeng
Kiana Ehsani
53
7
0
25 Sep 2024
Goal-Reaching Policy Learning from Non-Expert Observations via Effective Subgoal Guidance
Renming Huang
Shaochong Liu
Yunqiang Pei
Peng Wang
Guoqing Wang
Yang Yang
Hengtao Shen
OffRL
42
0
0
06 Sep 2024
Diffusion Policy Policy Optimization
Allen Z. Ren
Justin Lidard
Lars L. Ankile
Anthony Simeonov
Pulkit Agrawal
Anirudha Majumdar
Benjamin Burchfiel
Hongkai Dai
Max Simchowitz
59
38
0
01 Sep 2024
Unsupervised-to-Online Reinforcement Learning
Junsu Kim
Seohong Park
Sergey Levine
OnRL
65
3
0
27 Aug 2024
D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning
Rafael Rafailov
Kyle Hatch
Anikait Singh
Laura Smith
Aviral Kumar
...
Victor Kolev
Philip J. Ball
Jiajun Wu
Chelsea Finn
Sergey Levine
OffRL
34
3
0
15 Aug 2024
Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs
Kevin Tan
Wei Fan
Yuting Wei
OffRL
77
3
0
08 Aug 2024
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
Xu-Hui Liu
Tian-Shuo Liu
Shengyi Jiang
Ruifeng Chen
Zhilong Zhang
Xinwei Chen
Yang Yu
OffRL
OnRL
38
2
0
17 Jul 2024
Affordance-Guided Reinforcement Learning via Visual Prompting
Olivia Y. Lee
Annie Xie
Kuan Fang
Karl Pertsch
Chelsea Finn
OffRL
LM&Ro
76
9
0
14 Jul 2024
FOSP: Fine-tuning Offline Safe Policy through World Models
Chenyang Cao
Yucheng Xin
Silang Wu
Longxiang He
Zichen Yan
Junbo Tan
Xueqian Wang
OffRL
69
0
0
06 Jul 2024
Hybrid Reinforcement Learning from Offline Observation Alone
Yuda Song
J. Andrew Bagnell
Aarti Singh
OffRL
86
2
0
11 Jun 2024
Investigating Pre-Training Objectives for Generalization in Vision-Based Reinforcement Learning
Donghu Kim
Hojoon Lee
Kyungmin Lee
Dongyoon Hwang
Jaegul Choo
OffRL
46
1
0
10 Jun 2024
Strategically Conservative Q-Learning
Yutaka Shimizu
Joey Hong
Sergey Levine
Masayoshi Tomizuka
OffRL
OnRL
50
0
0
06 Jun 2024
Transductive Off-policy Proximal Policy Optimization
Yaozhong Gan
Renye Yan
Xiaoyang Tan
Zhe Wu
Junliang Xing
OffRL
37
2
0
06 Jun 2024
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Haotian Hu
Yiqin Yang
Jianing Ye
Chengjie Wu
Ziqing Mai
Yujing Hu
Tangjie Lv
Changjie Fan
Qianchuan Zhao
Chongjie Zhang
OffRL
OnRL
45
3
0
31 May 2024
Leveraging Offline Data in Linear Latent Bandits
Chinmaya Kausik
Kevin Tan
Ambuj Tewari
OffRL
51
2
0
27 May 2024
How to Leverage Diverse Demonstrations in Offline Imitation Learning
Sheng Yue
Jiani Liu
Xingyuan Hua
Ju Ren
Sen Lin
Junshan Zhang
Yaoxue Zhang
OffRL
34
3
0
24 May 2024
Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning
Changhong Wang
Xudong Yu
Chenjia Bai
Qiaosheng Zhang
Zhen Wang
40
1
0
12 May 2024
Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning
Stone Tao
Arth Shukla
Tse-kai Chan
Hao Su
OffRL
41
4
0
06 May 2024
DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets
Xiaoyu Huang
Yufeng Chi
Ruofeng Wang
Zhongyu Li
Xue Bin Peng
Sophia Shao
Borivoje Nikolic
Koushil Sreenath
OffRL
83
27
0
30 Apr 2024
Overcoming Knowledge Barriers: Online Imitation Learning from Visual Observation with Pretrained World Models
Xingyuan Zhang
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
OffRL
50
0
0
29 Apr 2024
ASID: Active Exploration for System Identification in Robotic Manipulation
Marius Memmel
Andrew Wagenmaker
Chuning Zhu
Patrick Yin
Dieter Fox
Abhishek Gupta
42
13
0
18 Apr 2024
A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage
Kevin Tan
Ziping Xu
OffRL
OnRL
42
5
0
07 Mar 2024
SELFI: Autonomous Self-Improvement with Reinforcement Learning for Social Navigation
Noriaki Hirose
Dhruv Shah
Kyle Stachowicz
A. Sridhar
Sergey Levine
71
5
0
01 Mar 2024
Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency
Yanxiao Zhao
Yangge Qian
Tianyi Wang
Jingyang Shan
Xiaolin Qin
29
0
0
01 Mar 2024
Foundation Policies with Hilbert Representations
Seohong Park
Tobias Kreiman
Sergey Levine
SSL
OffRL
55
21
0
23 Feb 2024
THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation
Wilbert Pumacay
Ishika Singh
Jiafei Duan
Ranjay Krishna
Jesse Thomason
Dieter Fox
29
40
0
13 Feb 2024