ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.10282
  4. Cited By
Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid
  Reinforcement Learning

Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning

17 May 2023
Gen Li
Wenhao Zhan
Jason D. Lee
Yuejie Chi
Yuxin Chen
    OffRL
    OnRL
ArXivPDFHTML

Papers citing "Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning"

30 / 30 papers shown
Title
On The Statistical Complexity of Offline Decision-Making
On The Statistical Complexity of Offline Decision-Making
Thanh Nguyen-Tang
R. Arora
OffRL
171
1
0
10 Jan 2025
Settling the Sample Complexity of Online Reinforcement Learning
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
149
23
0
25 Jul 2023
Leveraging Offline Data in Online Reinforcement Learning
Leveraging Offline Data in Online Reinforcement Learning
Andrew Wagenmaker
Aldo Pacchiano
OffRL
OnRL
52
40
0
09 Nov 2022
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
Yuda Song
Yi Zhou
Ayush Sekhari
J. Andrew Bagnell
A. Krishnamurthy
Wen Sun
OffRL
OnRL
63
102
0
13 Oct 2022
Distributionally Robust Model-Based Offline Reinforcement Learning with
  Near-Optimal Sample Complexity
Distributionally Robust Model-Based Offline Reinforcement Learning with Near-Optimal Sample Complexity
Laixi Shi
Yuejie Chi
OOD
OffRL
67
69
0
11 Aug 2022
Safe Exploration Incurs Nearly No Additional Sample Complexity for
  Reward-free RL
Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-free RL
Ruiquan Huang
J. Yang
Yingbin Liang
OffRL
70
9
0
28 Jun 2022
On the Statistical Efficiency of Reward-Free Exploration in Non-Linear
  RL
On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL
Jinglin Chen
Aditya Modi
A. Krishnamurthy
Nan Jiang
Alekh Agarwal
72
25
0
21 Jun 2022
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards
  Optimal Sample Complexity
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
Laixi Shi
Gen Li
Yuting Wei
Yuxin Chen
Yuejie Chi
OffRL
67
93
0
28 Feb 2022
Offline Reinforcement Learning with Realizability and Single-policy
  Concentrability
Offline Reinforcement Learning with Realizability and Single-policy Concentrability
Wenhao Zhan
Baihe Huang
Audrey Huang
Nan Jiang
Jason D. Lee
OffRL
310
111
0
09 Feb 2022
When is Offline Two-Player Zero-Sum Markov Game Solvable?
When is Offline Two-Player Zero-Sum Markov Game Solvable?
Qiwen Cui
S. Du
OffRL
64
29
0
10 Jan 2022
The Statistical Complexity of Interactive Decision Making
The Statistical Complexity of Interactive Decision Making
Dylan J. Foster
Sham Kakade
Jian Qian
Alexander Rakhlin
289
178
0
27 Dec 2021
Reward-Free Model-Based Reinforcement Learning with Linear Function
  Approximation
Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation
Weitong Zhang
Dongruo Zhou
Quanquan Gu
OffRL
51
28
0
12 Oct 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free
  Reinforcement Learning
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
63
52
0
09 Oct 2021
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale
  of Pessimism
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
Paria Rashidinejad
Banghua Zhu
Cong Ma
Jiantao Jiao
Stuart J. Russell
OffRL
194
286
0
22 Mar 2021
Bilinear Classes: A Structural Framework for Provable Generalization in
  RL
Bilinear Classes: A Structural Framework for Provable Generalization in RL
S. Du
Sham Kakade
Jason D. Lee
Shachar Lovett
G. Mahajan
Wen Sun
Ruosong Wang
OffRL
137
189
0
19 Mar 2021
Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov
  Decision Processes
Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes
Dongruo Zhou
Quanquan Gu
Csaba Szepesvári
66
206
0
15 Dec 2020
Fast active learning for pure exploration in reinforcement learning
Fast active learning for pure exploration in reinforcement learning
Pierre Ménard
O. D. Domingues
Anders Jonsson
E. Kaufmann
Edouard Leurent
Michal Valko
40
95
0
27 Jul 2020
FLAMBE: Structural Complexity and Representation Learning of Low Rank
  MDPs
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
Alekh Agarwal
Sham Kakade
A. Krishnamurthy
Wen Sun
OffRL
135
225
0
18 Jun 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
88
607
0
16 Jun 2020
Adaptive Reward-Free Exploration
Adaptive Reward-Free Exploration
E. Kaufmann
Pierre Ménard
O. D. Domingues
Anders Jonsson
Edouard Leurent
Michal Valko
48
81
0
11 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
131
1,806
0
08 Jun 2020
Dota 2 with Large Scale Deep Reinforcement Learning
Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI OpenAI
:
Christopher Berner
Greg Brockman
Brooke Chan
...
Szymon Sidor
Ilya Sutskever
Jie Tang
Filip Wolski
Susan Zhang
GNN
VLM
CLL
AI4CE
LRM
140
1,820
0
13 Dec 2019
Provably Efficient Reinforcement Learning with Linear Function
  Approximation
Provably Efficient Reinforcement Learning with Linear Function Approximation
Chi Jin
Zhuoran Yang
Zhaoran Wang
Michael I. Jordan
86
555
0
11 Jul 2019
Information-Theoretic Considerations in Batch Reinforcement Learning
Information-Theoretic Considerations in Batch Reinforcement Learning
Jinglin Chen
Nan Jiang
OOD
OffRL
133
375
0
01 May 2019
Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon
  MDP
Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
Kefan Dong
Yuanhao Wang
Xiaoyu Chen
Liwei Wang
OffRL
55
95
0
27 Jan 2019
Overcoming Exploration in Reinforcement Learning with Demonstrations
Overcoming Exploration in Reinforcement Learning with Demonstrations
Ashvin Nair
Bob McGrew
Marcin Andrychowicz
Wojciech Zaremba
Pieter Abbeel
OffRL
86
783
0
28 Sep 2017
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning
  and Demonstrations
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
Aravind Rajeswaran
Vikash Kumar
Abhishek Gupta
Giulia Vezzani
John Schulman
E. Todorov
Sergey Levine
126
1,093
0
28 Sep 2017
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement
  Learning
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
Christoph Dann
Tor Lattimore
Emma Brunskill
72
308
0
22 Mar 2017
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving
Shai Shalev-Shwartz
Shaked Shammah
Amnon Shashua
86
833
0
11 Oct 2016
Playing Atari with Deep Reinforcement Learning
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
114
12,201
0
19 Dec 2013
1