ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.15669
  4. Cited By
PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement
  Learning

PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning

25 May 2023
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Ya Zhang
    OffRLOnRL
ArXiv (abs)PDFHTML

Papers citing "PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning"

32 / 32 papers shown
Title
Dual Alignment Maximin Optimization for Offline Model-based RL
Dual Alignment Maximin Optimization for Offline Model-based RL
Chi Zhou
Wang Luo
Haoran Li
Congying Han
Tiande Guo
Zicheng Zhang
OffRL
143
0
0
02 Feb 2025
Offline-to-online Reinforcement Learning for Image-based Grasping with Scarce Demonstrations
Offline-to-online Reinforcement Learning for Image-based Grasping with Scarce Demonstrations
Bryan Chan
Anson Leung
James Bergstra
OffRLOnRL
131
0
0
19 Oct 2024
Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs
  and Practical Solutions
Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions
Yicheng Luo
Jackie Kay
Edward Grefenstette
M. Deisenroth
OffRLOnRL
67
16
0
30 Mar 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value
  Regularization
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Haoran Xu
Li Jiang
Jianxiong Li
Zhuoran Yang
Zhaoran Wang
Victor Chan
Xianyuan Zhan
OffRL
90
85
0
28 Mar 2023
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Han Zheng
Xufang Luo
Pengfei Wei
Xuan Song
Dongsheng Li
Jing Jiang
OffRLOnRL
69
24
0
14 Mar 2023
Extreme Q-Learning: MaxEnt RL without Entropy
Extreme Q-Learning: MaxEnt RL without Entropy
Divyansh Garg
Joey Hejna
Matthieu Geist
Stefano Ermon
OffRL
73
80
0
05 Jan 2023
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online
  Reinforcement Learning
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning
Yi Zhao
Rinu Boney
Alexander Ilin
Arno Solin
Joni Pajarinen
OffRLOnRL
79
40
0
25 Oct 2022
Reincarnating Reinforcement Learning: Reusing Prior Computation to
  Accelerate Progress
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Rameswar Panda
Marc G. Bellemare
OffRLOnRL
117
65
0
03 Jun 2022
Jump-Start Reinforcement Learning
Jump-Start Reinforcement Learning
Ikechukwu Uchendu
Ted Xiao
Yao Lu
Banghua Zhu
Mengyuan Yan
...
Chuyuan Fu
Cong Ma
Jiantao Jiao
Sergey Levine
Karol Hausman
OffRLOnRL
86
115
0
05 Apr 2022
Supported Policy Optimization for Offline Reinforcement Learning
Supported Policy Optimization for Offline Reinforcement Learning
Jialong Wu
Haixu Wu
Zihan Qiu
Jianmin Wang
Mingsheng Long
OffRL
79
70
0
13 Feb 2022
Online Decision Transformer
Online Decision Transformer
Qinqing Zheng
Amy Zhang
Aditya Grover
OffRL
81
209
0
11 Feb 2022
d3rlpy: An Offline Deep Reinforcement Learning Library
d3rlpy: An Offline Deep Reinforcement Learning Library
Takuma Seno
M. Imai
OffRLGP
108
106
0
06 Nov 2021
Offline Reinforcement Learning with Implicit Q-Learning
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
307
928
0
12 Oct 2021
Offline-to-Online Reinforcement Learning via Balanced Replay and
  Pessimistic Q-Ensemble
Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble
Seunghyun Lee
Younggyo Seo
Kimin Lee
Pieter Abbeel
Jinwoo Shin
OffRLOnRL
65
191
0
01 Jul 2021
A Minimalist Approach to Offline Reinforcement Learning
A Minimalist Approach to Offline Reinforcement Learning
Scott Fujimoto
S. Gu
OffRL
132
830
0
12 Jun 2021
Is Pessimism Provably Efficient for Offline RL?
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
187
360
0
30 Dec 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRLOnRL
121
617
0
16 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRLOnRL
146
1,836
0
08 Jun 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GPOffRL
237
1,384
0
15 Apr 2020
CURL: Contrastive Unsupervised Representations for Reinforcement
  Learning
CURL: Contrastive Unsupervised Representations for Reinforcement Learning
A. Srinivas
Michael Laskin
Pieter Abbeel
SSLDRLOffRL
119
1,092
0
08 Apr 2020
Deep Reinforcement Learning for Autonomous Driving: A Survey
Deep Reinforcement Learning for Autonomous Driving: A Survey
B. R. Kiran
Ibrahim Sobh
V. Talpaert
Patrick Mannion
A. A. Sallab
S. Yogamani
P. Pérez
358
1,693
0
02 Feb 2020
Imitation Learning via Off-Policy Distribution Matching
Imitation Learning via Off-Policy Distribution Matching
Ilya Kostrikov
Ofir Nachum
Jonathan Tompson
OODOffRL
158
205
0
10 Dec 2019
Advantage-Weighted Regression: Simple and Scalable Off-Policy
  Reinforcement Learning
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
Xue Bin Peng
Aviral Kumar
Grace Zhang
Sergey Levine
OffRL
157
570
0
01 Oct 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRLOnRL
140
1,067
0
03 Jun 2019
Challenges of Real-World Reinforcement Learning
Challenges of Real-World Reinforcement Learning
Gabriel Dulac-Arnold
D. Mankowitz
Todd Hester
OffRL
96
551
0
29 Apr 2019
Addressing Function Approximation Error in Actor-Critic Methods
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
195
5,226
0
26 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement
  Learning with a Stochastic Actor
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
317
8,432
0
04 Jan 2018
Proximal Policy Optimization Algorithms
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
580
19,315
0
20 Jul 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
82
107
0
06 Jul 2017
Reinforcement Learning with Deep Energy-Based Policies
Reinforcement Learning with Deep Energy-Based Policies
Tuomas Haarnoja
Haoran Tang
Pieter Abbeel
Sergey Levine
118
1,350
0
27 Feb 2017
Layer Normalization
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
435
10,541
0
21 Jul 2016
Continuous control with deep reinforcement learning
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
330
13,295
0
09 Sep 2015
1