Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2106.04895
Cited By
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
9 June 2021
Tengyang Xie
Nan Jiang
Huan Wang
Caiming Xiong
Yu Bai
OffRL
OnRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning"
50 / 50 papers shown
Title
MILE: Model-based Intervention Learning
Yigit Korkmaz
Erdem Bıyık
112
2
0
21 Feb 2025
On The Statistical Complexity of Offline Decision-Making
Thanh Nguyen-Tang
R. Arora
OffRL
140
1
0
10 Jan 2025
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
Max Wilcoxson
Qiyang Li
Kevin Frans
Sergey Levine
SSL
OffRL
OnRL
124
0
0
23 Oct 2024
Leveraging Unlabeled Data Sharing through Kernel Function Approximation in Offline Reinforcement Learning
Yen-Ru Lai
Fu-Chieh Chang
Pei-Yuan Wu
OffRL
98
1
0
22 Aug 2024
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
125
22
0
25 Jul 2023
On Bonus-Based Exploration Methods in the Arcade Learning Environment
Adrien Ali Taïga
W. Fedus
Marlos C. Machado
Aaron Courville
Marc G. Bellemare
40
59
0
22 Sep 2021
Nearly Horizon-Free Offline Reinforcement Learning
Tongzheng Ren
Jialian Li
Bo Dai
S. Du
Sujay Sanghavi
OffRL
61
49
0
25 Mar 2021
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
Paria Rashidinejad
Banghua Zhu
Cong Ma
Jiantao Jiao
Stuart J. Russell
OffRL
156
284
0
22 Mar 2021
Bilinear Classes: A Structural Framework for Provable Generalization in RL
S. Du
Sham Kakade
Jason D. Lee
Shachar Lovett
G. Mahajan
Wen Sun
Ruosong Wang
OffRL
99
189
0
19 Mar 2021
MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning
DiJia Su
Jason D. Lee
John M. Mulvey
H. Vincent Poor
OffRL
26
6
0
23 Feb 2021
Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
Ming Yin
Yu Bai
Yu Wang
OffRL
51
67
0
02 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Chi Jin
Qinghua Liu
Sobhan Miryoosefi
OffRL
76
216
0
01 Feb 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints
Chi Jin
Zhuoran Yang
Zhaoran Wang
OffRL
177
167
0
06 Jan 2021
A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost
Minbo Gao
Tianle Xie
S. Du
Lin F. Yang
48
46
0
02 Jan 2021
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
105
352
0
30 Dec 2020
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang
Chi Jin
Zhaoran Wang
Mengdi Wang
Michael I. Jordan
59
18
0
09 Nov 2020
Learning Quadrupedal Locomotion over Challenging Terrain
Joonho Lee
Jemin Hwangbo
Lorenz Wellhausen
V. Koltun
Marco Hutter
92
1,154
0
21 Oct 2020
Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited
O. D. Domingues
Pierre Ménard
E. Kaufmann
Michal Valko
38
96
0
07 Oct 2020
A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
Qinghua Liu
Tiancheng Yu
Yu Bai
Chi Jin
58
122
0
04 Oct 2020
Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration
Andrea Zanette
A. Lazaric
Mykel J. Kochenderfer
Emma Brunskill
55
64
0
18 Aug 2020
Provably Good Batch Reinforcement Learning Without Great Exploration
Yao Liu
Adith Swaminathan
Alekh Agarwal
Emma Brunskill
OffRL
89
105
0
16 Jul 2020
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning
Ming Yin
Yu Bai
Yu Wang
OffRL
75
31
0
07 Jul 2020
Near-Optimal Reinforcement Learning with Self-Play
Yunru Bai
Chi Jin
Tiancheng Yu
121
131
0
22 Jun 2020
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
Alekh Agarwal
Sham Kakade
A. Krishnamurthy
Wen Sun
OffRL
92
225
0
18 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
104
1,780
0
08 Jun 2020
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
T. Matsushima
Hiroki Furuta
Y. Matsuo
Ofir Nachum
S. Gu
OffRL
43
147
0
05 Jun 2020
MOPO: Model-based Offline Policy Optimization
Tianhe Yu
G. Thomas
Lantao Yu
Stefano Ermon
James Zou
Sergey Levine
Chelsea Finn
Tengyu Ma
OffRL
67
759
0
27 May 2020
MOReL : Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
75
662
0
12 May 2020
Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition
Zihan Zhang
Yuanshuo Zhou
Xiangyang Ji
OffRL
53
156
0
21 Apr 2020
Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison
Tengyang Xie
Nan Jiang
58
35
0
09 Mar 2020
Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette
A. Lazaric
Mykel Kochenderfer
Emma Brunskill
OffRL
63
222
0
29 Feb 2020
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
68
678
0
26 Nov 2019
Solving Rubik's Cube with a Robot Hand
OpenAI
Ilge Akkaya
Marcin Andrychowicz
Maciek Chociej
Ma-teusz Litwin
...
Peter Welinder
Lilian Weng
Qiming Yuan
Wojciech Zaremba
Lei Zhang
ODL
73
1,215
0
16 Oct 2019
Emergent Tool Use From Multi-Agent Autocurricula
Bowen Baker
I. Kanitscheider
Todor Markov
Yi Wu
Glenn Powell
Bob McGrew
Igor Mordatch
LRM
64
649
0
17 Sep 2019
Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment
Adrien Ali Taïga
W. Fedus
Marlos C. Machado
Aaron Courville
Marc G. Bellemare
49
40
0
06 Aug 2019
Provably Efficient Reinforcement Learning with Linear Function Approximation
Chi Jin
Zhuoran Yang
Zhaoran Wang
Michael I. Jordan
79
549
0
11 Jul 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
57
939
0
19 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
79
1,044
0
03 Jun 2019
Provably Efficient Q-Learning with Low Switching Cost
Yu Bai
Tengyang Xie
Nan Jiang
Yu Wang
39
93
0
30 May 2019
Information-Theoretic Considerations in Batch Reinforcement Learning
Jinglin Chen
Nan Jiang
OOD
OffRL
93
373
0
01 May 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
157
1,586
0
07 Dec 2018
Policy Certificates: Towards Accountable Reinforcement Learning
Christoph Dann
Ashutosh Adhikari
Wei Wei
Jimmy J. Lin
OffRL
94
142
0
07 Nov 2018
Is Q-learning Provably Efficient?
Chi Jin
Zeyuan Allen-Zhu
Sébastien Bubeck
Michael I. Jordan
OffRL
52
801
0
10 Jul 2018
QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation
Dmitry Kalashnikov
A. Irpan
P. Pastor
Julian Ibarz
Alexander Herzog
...
Deirdre Quillen
E. Holly
Mrinal Kalakrishnan
Vincent Vanhoucke
Sergey Levine
98
1,454
0
27 Jun 2018
Rainbow: Combining Improvements in Deep Reinforcement Learning
Matteo Hessel
Joseph Modayil
H. V. Hasselt
Tom Schaul
Georg Ostrovski
Will Dabney
Dan Horgan
Bilal Piot
M. G. Azar
David Silver
OffRL
94
2,255
0
06 Oct 2017
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
Christoph Dann
Tor Lattimore
Emma Brunskill
62
307
0
22 Mar 2017
Minimax Regret Bounds for Reinforcement Learning
M. G. Azar
Ian Osband
Rémi Munos
68
771
0
16 Mar 2017
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable
Nan Jiang
A. Krishnamurthy
Alekh Agarwal
John Langford
Robert Schapire
90
417
0
29 Oct 2016
Model-based Reinforcement Learning and the Eluder Dimension
Ian Osband
Benjamin Van Roy
65
188
0
07 Jun 2014
Empirical Bernstein Bounds and Sample Variance Penalization
Andreas Maurer
Massimiliano Pontil
167
540
0
21 Jul 2009
1